TechStory

Activists create searchable index of 107 million science articles

by Manasi Varma
October 15, 2021
in News, Popular, World

In what may serve as great news for knowledge-seekers, activists have created a searchable index of over 107 million science articles. A total of 107,233,728 distinct papers have been cataloged into a General Index, which is said to have a size of a whopping 38 terabytes.

The Index is a searchable collection of short sentences and keywords from published articles, which can be used to provide a gateway to scientific knowledge. The compressed version alone has a size of 8.5 terabytes, and can be accessed through archive.org, in a process that, despite being direct, is rather cumbersome.

Keywords and N-Grams to Help You Track Articles

Fortunately, users on the /r/DataHoarder subreddit have uploaded the data to a remote server, and are also in the process of spreading it across BitTorrent.

Note, however, that the General Index doesn’t contain the journal articles in their entirety. Instead, it holds only keywords and n-grams (short phrases that contain a keyword). These phrases make tracking down articles easier.
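To make the idea of an n-gram concrete, here is a minimal Python sketch that lists every run of n consecutive words containing a given keyword. The article does not describe how the General Index actually extracts its phrases, so the function name `ngrams_containing`, the tokenization rule, and the sample sentence are all assumptions made for illustration.

```python
import re


def ngrams_containing(text, keyword, n=3):
    """Return every run of n consecutive words in text that includes keyword.

    Hypothetical helper for illustration only; not the General Index's
    own extraction method, which the article does not describe.
    """
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z0-9']+", text.lower())
    keyword = keyword.lower()
    # Slide a window of n words across the token list.
    grams = (words[i:i + n] for i in range(len(words) - n + 1))
    # Keep only the windows in which the keyword appears as a whole word.
    return [" ".join(g) for g in grams if keyword in g]


sample = "Gene editing alters DNA. Editing tools target a single gene."
print(ngrams_containing(sample, "gene", n=3))
```

An index built from phrases like these can point a reader to the papers that mention a term without reproducing the papers themselves.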

Work in Progress

Public.Resource.org founder and General Index co-creator Carl Malamud has said that the version that has been released is still a “work in progress.” He has highlighted how the process was not an easy one, with text extraction failing sometimes, and metadata not being available at other times. This, he says, is the reason why the “corpus,” despite being large, is still not up to date and complete.

Nevertheless, General Index represents, at least to Malamud, a “lookup tool, a dictionary of knowledge, a map to knowledge,” something he believes is necessary to the modern practice of science. He further adds that his team views the database as a public utility, and is not keen to assert any type of ownership of the tool.

Trying to Avoid the Law

Image Credits: The Verge

At the same time, publicly sharing scientific articles that exist behind paywalls was, and still is, illegal. Take for example Sci-Hub, which has been facing the ire of world governments for years. Malamud is hoping that the General Index will be able to sit in the public domain thanks to the innovative approach his team has taken to the database. Nevertheless, he has landed in trouble for similar efforts before, when the State of Georgia accused him of terrorism and sued him, following his attempts to post the State’s laws online for the world to read.
