Why Google's search engine trial is about AI(2)


April 29, 2025, 3:59 PM ET

<News>

AI and search are connected like this: Search engine indices are essentially giant databases of pages and information on the web.

Google has its index, which contains hundreds of billions of web pages and is over 100,000,000 gigabytes, according to court documents.

This is the data Google's search engine scans when responding to a user's query.

AI developers can use these kinds of databases to build and train the models used to power chatbots.

In court, attorneys for the DOJ have argued that Google's Gemini pulls information from the company's search index, including citing search links and results, extending what they say is a self-serving cycle.

They argue that Google's ability to monopolize the search market gives it user data, at a huge scale — an advantage over other AI developers.

The Justice Department argues Google's monopoly over search could have a direct effect on the development of generative AI, a type of artificial intelligence that uses existing data to create new content like text, videos or photos, based on a user's prompts or questions.

Last week, the government called executives from several major AI companies, like OpenAI and Perplexity, in an attempt to argue that Google's stranglehold on search is preventing some of those companies from truly growing.

The government argues that to level the playing field, Google should be forced to open its search data — like users' search queries, clicks, and results — and license it to other competitors at a cost.

This is on top of demands related to Google's search engine business, most notably that it should be forced to sell off its Chrome browser.

<Vocabulary>

*search engine indices: 검색 엔진 인덱스/색인

*giant databases of pages and information on the web: 웹상에 존재하는 수많은 페이지와 정보를 저장한 거대한 데이터베이스

*according to court documents: 법원 문서에 따르면

*when responding to a user's query: 사용자의 질문에 응답할 때

*build and train the models used to power chatbots: 챗봇을 구동시키기 위한 모델을 만들고 훈련시키다

*pull information from the company's search index: 그 회사(구글)의 검색 색인에서 정보를 가져오다

*self-serving: 자기 잇속만 차리는, 자기 이익만 챙기는

*extending what they say is a self-serving cycle: 그들이 자기 잇속만 차리는 순환 구조라고 부르는 것을 확장하면서

*at/on a huge scale: 대규모로, 방대하게

*an advantage over other AI developers: 다른 AI 개발자들보다 나은 이점

*a direct effect on the development of generative AI, a type of artificial intelligence: 인공지능의 한 유형인, 생성형 AI 발달에 직접적인 영향

*a user's prompts or questions: 사용자의 명령이나 질문들

*Google's stranglehold on search: 검색(시장)에서 구글의 목조르기; 완전한 지배

*to level the playing field: 경쟁의 장을 대등하게 하다, 경쟁의 장을 평평하게 하다

*at a cost of ~ : ~의 대가를 지불하여, ~의 비용으로

*license: (대가를 받고) 사용을 허가하다, 라이선스를 주다

*license it to other competitors at a cost: 비용을 받고 다른 경쟁자들에게 그것의 사용권을 허가하다

<Translation>

AI와 검색은 다음과 같이 연결되어 있습니다.

검색 엔진 인덱스는 본질적으로 웹상에 존재하는 수많은 페이지와 정보를 저장한 거대한 데이터베이스입니다.

구글은 자사만의 인덱스를 보유하고 있으며, 법원 문서에 따르면 이 인덱스는 수천억 개의 웹페이지를 포함하고 있고, 용량은 1억 기가바이트(100,000,000GB)를 넘습니다.

구글 검색 엔진은 사용자의 질문에 응답할 때 이 인덱스를 스캔합니다.

AI 개발자들은 이러한 종류의 데이터베이스를 활용하여 챗봇을 구동하는 모델을 구축하고 훈련할 수 있습니다.

법정에서 법무부 측 변호사들은 구글의 제미나이가 검색 링크와 결과를 인용하는 것을 포함해 자사 검색 인덱스에서 정보를 가져오고 있으며, 이것이 이른바 자기 잇속만 차리는 순환 구조를 확장한다고 주장했습니다.

즉, 구글이 검색 시장을 독점함으로써 방대한 규모의 사용자 데이터를 확보하고 있으며, 이는 다른 AI 개발자들에 비해 큰 이점을 준다는 것입니다.

법무부는 구글의 검색 독점이 생성형 AI의 발전에 직접적인 영향을 줄 수 있다고 주장합니다.

생성형 AI는 기존 데이터를 활용해 텍스트, 영상, 이미지 등 새로운 콘텐츠를 생성하는 인공지능 기술로, 사용자의 질문이나 명령을 기반으로 작동합니다.

지난주 정부는 OpenAI, Perplexity 등 여러 주요 AI 기업의 경영진을 증인으로 불러, 구글의 검색 시장 장악이 이들 기업 중 일부의 실질적인 성장을 가로막고 있다고 주장하려 했습니다.

정부는 경쟁의 장을 평평하게 만들기 위해 구글이 자사의 검색 데이터(사용자 검색어, 클릭, 검색 결과 등)를 공개하고, 비용을 받고 다른 경쟁사에 라이선스하도록 강제해야 한다고 주장합니다.

이는 구글의 검색 엔진 사업과 관련된 요구들에 더해지는 것으로, 그중 가장 주목되는 것은 구글이 크롬 브라우저를 매각하도록 강제해야 한다는 요구입니다.


englishportal 님의 블로그
