LG Hamburg: TDM exception for scientific research covers a public training dataset

The Hamburg Regional Court (LG Hamburg) has ruled (judgment of 27 September 2024, ref. 310 O 227/23) that the downloading of a copyright-protected photograph by the provider of a dataset for training artificial intelligence falls, in the case at hand, under the text and data mining ("TDM") exception for scientific research purposes under Section 60d of the German Copyright Act (UrhG). The training of an AI itself was not the subject of the ruling.

The defendant is the non-profit research network Laion (short for "Large-Scale Artificial Intelligence Open Network"). Among other things, it makes a dataset for training AI models publicly available free of charge. The dataset contains almost 6 billion links to publicly accessible images, each with a description of the image content. Before publishing the dataset, the defendant had downloaded the images linked in a pre-existing dataset, used software to check whether the respective description was correct, and enriched the images with metadata. One of the images affected had been made available on the Internet by a picture agency and carried the agency's watermark. The agency's website contained an objection to scraping.

Against this background, the Regional Court assessed permissibility under copyright law as follows:

No merely transient or incidental reproduction

Section 44a UrhG did not apply. It exempts a reproduction that is transient or incidental, is an integral and essential part of a technical process, serves only transmission in a network or the lawful use of a work, and has no independent economic significance:

  • The reproduction was not transient, because deletion was not user-independent but resulted only from corresponding programming by the provider; moreover, the defendant had said nothing about the storage period.
  • It was also not incidental, because the images were downloaded specifically for an analysis, i.e. in a conscious and active act of procurement prior to the analysis.

Under Swiss copyright law, the situation should be assessed in the same way. Art. 24a CopA exempts temporary reproductions under the same conditions as Section 44a UrhG. The reproduction of copyrighted works in a dataset for the purpose of training an AI model would hardly be covered by it (see our FAQ on the AI Act, question 59).

Application of the TDM exception for scientific research

The German Copyright Act governs the exemption of reproductions for text and data mining (TDM) in two provisions:

  • Section 60d UrhG permits TDM by research organisations and other institutions that conduct non-commercial scientific research.
  • Section 44b UrhG contains a general TDM exception that also applies outside non-commercial research, but subject to a reservation of use by the rights holder for publicly accessible works (and with a deletion obligation that does not apply under Section 60d UrhG).

Unlike Section 44b UrhG, Section 60d UrhG applies. The reproduction took place in the course of TDM. TDM is the automated analysis of digital works in order to obtain information, particularly about patterns, trends and correlations. That was the case here: the reproduction was used to find "correlations", namely those between image content and image description.

In this case, Laion carried out the TDM for the purposes of scientific research:

The concept of scientific research, for which the methodical and systematic "pursuit" of new knowledge suffices, is not to be understood so narrowly that it covers only the work steps directly associated with the acquisition of knowledge. It is sufficient that the work step in question is aimed at a (later) gain in knowledge […]. In particular, the concept of scientific research does not presuppose any subsequent research success.

This could also cover the trai­ning of an AI:

Although the creation of the dataset as such may not yet be associated with a gain in knowledge, it is a fundamental work step with the aim of using the dataset to gain knowledge at a later date. It can be affirmed that such an objective also existed in the present case. It is sufficient that the dataset was, undisputedly, published free of charge and thus (also) made available to researchers in the field of artificial neural networks.

It was the­r­e­fo­re not rele­vant whe­ther the deve­lo­p­ment of the defendant’s own AI models con­sti­tu­ted the defendant’s own research:

Whether the dataset […] will also be used by commercial companies for training or further developing their AI systems is irrelevant in any case, because research by commercial companies is still research, even if it is not privileged as such under Sections 60c et seq. UrhG.

However, the privilege of Section 60d UrhG applies only to non-commercial research. This requirement was met in the present case because the defendant made the dataset publicly available free of charge.

The Swiss Copyright Act (CopA) provides for a TDM exception only in the context of science, in Art. 24d CopA:

1 For the purpose of scientific research, it is permissible to reproduce a work if the reproduction is necessitated by the use of a technical process and there is lawful access to the works to be reproduced.

2 Repro­duc­tions made in the con­text of this Artic­le may be retai­ned for archi­ving and back­up pur­po­ses after com­ple­ti­on of the sci­en­ti­fic research.

3 This Artic­le shall not app­ly to the repro­duc­tion of com­pu­ter programs.

It seems natural to assess the situation here in the same way as the Hamburg Regional Court. The Swiss concept of research is not narrower; on the contrary, it also includes commercial research (which in Germany is exempted only under Section 44b UrhG). It can hardly be argued that every training of an AI is covered (not every training is likely to be aimed at gaining knowledge), but in the particular circumstances of the present case, Art. 24d CopA is also likely to apply.

General TDM exception probably does not apply

The general TDM exception of Section 44b UrhG, on the other hand, probably does not apply. Its general requirements are nevertheless fulfilled (obiter dicta):

The Regional Court doubts the view, "occasionally advocated", that the TDM exception covers only the exploitation of "information hidden in the data" and not also the use of "the content of the intellectual creation" (obiter, since Section 60d UrhG already applies):

  • This distinction is justified in the literature by the fact that the training of an AI ultimately serves to generate new image content with the AI, which is why Section 44b UrhG is said to require a corresponding teleological reduction. However, as the Regional Court notes, this intention and the success of the training have not yet been established.
  • In addition, it follows from Art. 53 para. 1 lit. c of the AI Act that the TDM exception under European law can at least cover training: providers of general-purpose AI models (GPAIM) must, among other things, put in place a policy to comply with Union copyright law, which in particular covers identifying and complying with a reservation of rights expressed pursuant to Article 4(3) of Directive (EU) 2019/790 (the TDM exception), and Section 44b UrhG implements this provision.

The InfoSoc Directive, the directive on the harmonization of certain aspects of copyright and related rights in the information society, also had to be taken into account:

  • Its Art. 5(5) permits the application of the TDM exception only in special cases in which the normal exploitation of the work is not impaired and the legitimate interests of the rights holder are not unreasonably prejudiced.
  • This is also the case here. In particular, the possibility of competition from AI-generated content is not sufficient, if only because future, not yet foreseeable developments would not allow a legally certain distinction between permissible and impermissible uses.

Finally, the downloaded works were also lawfully accessible, as required by Section 44b UrhG. What was downloaded was not the original image, which was offered only under licence, but a watermarked preview image posted for advertising purposes.

However, the application of Section 44b UrhG is likely to fail because of an effective reservation of use (again obiter):

  • The reservation of use had been declared by the picture agency, which was authorized to do so as a rights holder, and the plaintiff, as rights holder, should be able to invoke it.
  • The reser­va­ti­on was for­mu­la­ted cle­ar­ly enough. The fact that it con­cer­ned all published works does not con­tra­dict this.
  • It was probably also machine-readable. "Machine-readable" should be interpreted as "machine-understandable". In the opinion of the Regional Court, a reservation written in natural language should also suffice, because such reservations are at least machine-readable with a corresponding AI (the Regional Court again refers to Art. 53 para. 1 lit. c AIA, under which a GPAIM provider's copyright compliance policy also covers the identification of and compliance with a reservation of rights "including through state-of-the-art technologies"). However, the Regional Court points out that it is probably departing from the majority opinion here. Ultimately, the question could be left open.
