Forbes: Using Data Science in the Journalism Industry

In the age of digital media, the performance of an article, measured by readership or other forms of user engagement, is directly quantifiable. This makes it an important metric for publishers, with both reputational and financial consequences.

The goal of this project was to better understand the factors that influence an article’s appeal to readers, so that this understanding can inform business and editorial decisions at the pre-publication stage. Specifically, Forbes wanted to learn why similar articles that address the same topics and audiences can perform so differently.

Forbes was primarily focused on the intrinsic properties of articles, not on external, post-publication factors such as promotional campaigns on social media. Examples of intrinsic properties include an article’s headline, metadata, subject matter, vocabulary, writing style, accompanying media, and publication time.
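
To make the idea of intrinsic properties concrete, here is a minimal sketch of how a few of them might be turned into numeric or categorical features before publication. It is only an illustration: the `Article` record, its field names, and the specific features chosen are assumptions made for this example, not the schema or feature set used in the project.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Article:
    # Hypothetical record; the field names are illustrative assumptions.
    headline: str
    body: str
    topic: str
    published_at: datetime
    image_count: int


def intrinsic_features(article: Article) -> dict:
    """Collect features that are already known before the article goes live."""
    words = article.body.split()
    # Normalize tokens so "Data" and "data." count as the same vocabulary item.
    vocabulary = {w.lower().strip(".,;:!?\"'") for w in words}
    vocabulary.discard("")  # drop tokens that were pure punctuation
    return {
        "headline_word_count": len(article.headline.split()),
        "headline_is_question": article.headline.strip().endswith("?"),
        "body_word_count": len(words),
        "vocabulary_size": len(vocabulary),
        "topic": article.topic,
        "image_count": article.image_count,
        "publish_hour": article.published_at.hour,
        "publish_weekday": article.published_at.weekday(),  # Monday = 0
    }
```

None of these features depends on what happens after publication (shares, promotion, referral traffic), which is the distinction drawn above between intrinsic and external factors.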