We begin by collecting publicly available data from dozens of sources, starting with the article and outlet ("primary"), extending outward to social networks, trade organizations and media lists ("related"), and finally to other repositories of professionally relevant information ("ancillary").

Primary Sources
  • Published news
  • RSS feeds
  • Transcripts
  • Metadata
  • Closed captioning
  • Newsletters
  • Podcasts
  • Google alerts
  • Social media
Related Sources
  • Mastheads
  • LinkedIn
  • Professional memberships
  • Public social media profiles
  • Web forums
  • Media lists
  • Wikipedia/Wikimedia
  • Amazon/book lists
  • Public records
Ancillary Sources
  • Honors, awards
  • Public offices
  • Grants, institutional support
  • School, university records
  • Crowdfunding campaigns
  • Public company records
  • Social media connections
  • Non-media membership rolls
  • User-submitted data

This data is verified, indexed and organized at the article level, and merged into raw indices that are stored on encrypted cloud servers.

Primary Data + Related Data + Ancillary Data → Raw Data
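
As a rough illustration only, the merge step could look something like the Python sketch below. The record fields (`article_id`, `tier`, `verified`, `payload`) and the function name `merge_by_article` are assumptions made for this example, not a published schema, and encryption at rest is left out entirely.

```python
from collections import defaultdict

# The three source tiers named in the text.
TIERS = ("primary", "related", "ancillary")

def merge_by_article(records):
    """Group verified records into one raw-index entry per article."""
    index = defaultdict(lambda: {tier: [] for tier in TIERS})
    for rec in records:
        if not rec.get("verified", False):
            continue  # unverified items never reach the raw index
        index[rec["article_id"]][rec["tier"]].append(rec["payload"])
    return dict(index)

# Hypothetical sample records, invented for the example.
records = [
    {"article_id": "a1", "tier": "primary", "verified": True,
     "payload": {"headline": "Council approves budget"}},
    {"article_id": "a1", "tier": "related", "verified": True,
     "payload": {"masthead_role": "Staff writer"}},
    {"article_id": "a1", "tier": "ancillary", "verified": False,
     "payload": {"award": "unconfirmed"}},
    {"article_id": "a2", "tier": "primary", "verified": True,
     "payload": {"headline": "School board meets"}},
]
raw_index = merge_by_article(records)
```

Keying the index on the article keeps every tier of supporting data attached to the piece of journalism it describes.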


Raw data is sent to our proprietary parsing engine, which uses machine learning, pattern recognition, natural-language processing and other trendy pieces of technology to make sense of this digital slush pile.

During this process, we implement strict privacy protection measures to prevent sensitive and irrelevant personal information from going public.

Even if a journalist were to volunteer personal details that we don't consider to be professionally relevant — a child's photo on a public social media account, for example — our analysis errs on the side of privacy.
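
One simple way to "err on the side of privacy" is a default-deny allowlist: a field is kept only if it is explicitly marked as professionally relevant. The sketch below assumes this approach; the field names and the `strip_personal_details` helper are hypothetical, and the actual relevance criteria are not described here.

```python
# Hypothetical allowlist of professionally relevant fields.
PROFESSIONAL_FIELDS = {"name", "outlet", "beat", "byline_url", "awards"}

def strip_personal_details(profile):
    """Keep allowlisted fields only; anything unrecognized is dropped."""
    return {key: value for key, value in profile.items()
            if key in PROFESSIONAL_FIELDS}

# Hypothetical collected profile, including a detail the journalist
# volunteered publicly but that is not professionally relevant.
collected = {
    "name": "Jane Doe",
    "outlet": "Example Gazette",
    "beat": "City Hall",
    "family_photo_url": "https://example.com/photo.jpg",
}
public_profile = strip_personal_details(collected)
```

Because the filter names what may pass rather than what must be blocked, a new and unanticipated kind of personal detail is excluded by default.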

The output is organized into discrete units of data that adhere to the Universal News Protocol.

Raw Data
Parsing Engine
  • Entity extraction, categorization and collation
  • Contextual disambiguation
  • Link, hashtag and keyword analysis
  • Related- and weak-signal detection
  • Human-intelligence microtasks
  • On-demand manual correction and verification
Discrete Data
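
To make one of the steps above concrete, here is a minimal sketch of link, hashtag and keyword analysis using only regular expressions and word counts. It is a toy stand-in, not the proprietary engine: the patterns, the stopword list and the `analyze` function are all assumptions for illustration, and a production system would rely on proper NLP tooling.

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://[^\s<>\"']+")
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")
WORD_RE = re.compile(r"[a-z']+")
# Tiny illustrative stopword list; real lists are far longer.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "from"}

def analyze(text, top_n=3):
    """Return the links, hashtags and most frequent keywords in text."""
    links = URL_RE.findall(text)
    # Remove links first so URL fragments are not counted as keywords.
    remainder = URL_RE.sub(" ", text)
    hashtags = [tag.lower() for tag in HASHTAG_RE.findall(remainder)]
    remainder = HASHTAG_RE.sub(" ", remainder)
    words = [w for w in WORD_RE.findall(remainder.lower())
             if len(w) > 2 and w not in STOPWORDS]
    keywords = [word for word, _ in Counter(words).most_common(top_n)]
    return {"links": links, "hashtags": hashtags, "keywords": keywords}

sample = ("City council approves budget. Council vote details at "
          "https://example.com/vote #Budget2024")
result = analyze(sample)
```

Frequency counting is the crudest possible keyword signal; it is used here only because it keeps the example self-contained.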


Prior to distribution, these discrete data records are stored on a blockchain network. By employing a decentralized ledger, we benefit from several of blockchain’s features, chief among them immutability.

However, using blockchain is more than just a technical solution. It's an ethical signal to the industry that we're not claiming ownership over publisher data.

In fact, publishers who opt to upload their production data directly to Pressland's servers are given their own private keys. They continue to own and control their data; they can revoke access at any time; and, whenever this data is accessed by a commercial client, they receive a licensing fee in the form of revenue-sharing.

Discrete Data
Blockchain Storage
  • Immutable records stored on distributed network
  • Inaccurate data easily flagged and suppressed
  • Archived data instantly accessible
  • Private keys held by participating publishers
Application Layer
Direct Publisher Feeds
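
The two properties listed above, immutable records and inaccurate data that can be flagged and suppressed, can coexist because a correction is itself a new record. The sketch below shows the idea with a single-process hash chain. It is not a distributed blockchain and not Pressland's implementation; the `Ledger` class and its methods are hypothetical names chosen for the example.

```python
import hashlib
import json

def _digest(body):
    """SHA-256 over a canonical JSON encoding of a block body."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

class Ledger:
    """Append-only chain in which each block commits to the previous hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self.blocks = []

    def append(self, payload):
        prev = self.blocks[-1]["hash"] if self.blocks else self.GENESIS
        body = {"index": len(self.blocks), "prev": prev, "payload": payload}
        self.blocks.append({**body, "hash": _digest(body)})
        return body["index"]

    def flag(self, index, reason):
        # A flag is a new entry; the flagged record is never rewritten.
        return self.append({"flags": index, "reason": reason})

    def is_valid(self):
        prev = self.GENESIS
        for position, block in enumerate(self.blocks):
            body = {key: block[key] for key in ("index", "prev", "payload")}
            if (block["index"] != position or block["prev"] != prev
                    or block["hash"] != _digest(body)):
                return False
            prev = block["hash"]
        return True

    def visible(self):
        """Payloads that have not been flagged, excluding the flags themselves."""
        flagged = {b["payload"]["flags"] for b in self.blocks
                   if "flags" in b["payload"]}
        return [b["payload"] for b in self.blocks
                if "flags" not in b["payload"] and b["index"] not in flagged]

ledger = Ledger()
first = ledger.append({"article": "a1", "byline": "J. Doe"})
ledger.append({"article": "a2", "byline": "A. Smith"})
ledger.flag(first, "byline misattributed")
```

After the flag, `ledger.visible()` omits the first record while `ledger.blocks` still holds it, so the history stays auditable. Editing any stored payload in place changes its digest and causes `is_valid()` to return `False`.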


Pressland’s data is available to partners and third-party developers via an Application Programming Interface (API). Commercial clients pay licensing fees based on data consumption and their intended use of this data.

Qualified nonprofits, NGOs, academics, researchers and media partners receive discounted or free access; basic public access via our website, extensions and future apps will always be free.

Standard safeguards such as data and search limits will be implemented to prevent abuse.
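
A search limit of this kind is commonly built as a per-key sliding window. The sketch below assumes that design; the `SearchLimiter` class, the limit values and the API key strings are illustrative only, since the actual limits have not been specified.

```python
import time
from collections import defaultdict, deque

class SearchLimiter:
    """Allow at most max_requests per API key within a sliding time window."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self.hits = defaultdict(deque)

    def allow(self, api_key):
        now = self.clock()
        hits = self.hits[api_key]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True

# Usage with a fake clock so the behavior is deterministic.
fake_now = [0.0]
limiter = SearchLimiter(max_requests=2, window_seconds=60,
                        clock=lambda: fake_now[0])
first_ok = limiter.allow("client-a")
second_ok = limiter.allow("client-a")
third_ok = limiter.allow("client-a")   # over the limit within the window
```

Tracking each key separately means one heavy consumer cannot exhaust the allowance of another.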

Application Layer
  • SaaS
  • SDK
  • Other data subscriptions
  • White label
  • Plug-ins/extensions
  • Third-party apps
  • Other consumer products