Building news ranking and recommendation systems is more challenging than it seems on the surface. Sometimes press releases are mixed in with news. Often, opinion pieces hide among news articles and need to be identified correctly as opinion, not news. And what seems important in one culture or language may not be as important in another. With these and other considerations in mind, how should news algorithms work, especially amid the pitfalls of misinformation, disinformation, and partisan bias? And what is news, anyway?
To address these challenges, students teamed up on October 16 and 17, 2020, to participate in the NewsQ for Social Good Hackathon. At the end of the hackathon, eight winning teams were chosen for high-potential projects that prototyped news ranking and recommendation systems incorporating Social Good.
Organized in collaboration between the Georgia Tech Center for Computing and Societies and the NewsQ Initiative, the NewsQ for Social Good hackathon challenged teams of students to prototype a news ranking and recommendation algorithm/service that surfaces articles from existing news and information online from a country of their choice.
Information for the prototypes could be gathered from existing news sites, social media platforms such as Twitter, YouTube, Reddit, and others, or a combination of both sources. Each team chose a country that was not the United States, and, because of the Social Good aspect of the hackathon, NewsQ hackathon participants were encouraged to organize in interdisciplinary teams. According to hackathon parameters, each prototype was expected to result in a list of ranks, a design document, and a work product.
The NewsQ for Social Good Hackathon was held as part of the larger HackGT 7 hackathon, which was organized by undergraduates, hosted by Georgia Tech, and attracted about one thousand participants from all over the world.
NewsQ for Social Good Hackathon Winners
Eight winning teams were chosen by the end of the hackathon. Here are short descriptions of the projects, provided by the teams themselves:
1st Place (Shared)
Cleannews developed a prototype web application that provides unbiased, verified, and analyzed news in India. The prototype filters for political news articles from native Indian publications and assesses veracity, bias, and possible clickbait based on the content of the articles. The prototype uses Google’s TensorFlow to present articles with diverse viewpoints and eliminate articles with hate speech and offensive content, and then uses this data and corresponding weights to rank the news articles in terms of quality. Finally, Cleannews performs sentiment analysis to contextualize articles for users and keyword analysis to extract the key terms, and then presents all of the information in a clean, concise, and readable way.
Focusing on media in Thailand, Newsly allows users to customize their feeds based on several factors the team believes are important to democratic consumption, such as sentiment analysis, peer reputation, news citations, political bias, Twitter article engagement, social bot promotion ratios, and even advertising aggressiveness. Newsly was built with a Postgres database, Flask server, and HTML/CSS frontend. The Newsly team drew on several open source resources as well, including but not limited to Media Cloud, MediaRank, Twitter, and PyTorch.
2nd Place (Shared)
CanadaRank prototyped a bot that can effectively rank news articles about Canada’s breaking news sector using Natural Language Processing, along with other methods.
Using Uzbekistan media as a model, Duv-Duv Gap prototyped a framework for low-resource languages that uses historical data from news articles and related metadata to predict the user engagement of new articles.
3rd Place (Shared)
Focusing on the United Kingdom, got bias? filters news articles and ranks them according to a base algorithm that favors unbiased writers and reliable sources. On the “got bias?” website, a list of the top-ranked news sources is displayed. The algorithm also takes the publication date into account, so the most recent news is always near the top of the rankings.
The sentiment analysis model was trained on data web-scraped from news articles, and used Naive Bayes classifiers to detect bias. The ranked articles were then entered into a Microsoft Excel spreadsheet.
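As an illustration of how a ranking could favor unbiased, reliable sources while keeping recent news near the top, here is a minimal sketch. It is not the got bias? team's actual algorithm: the `Article` fields, the 0-to-1 scales, the multiplicative formula, and the 24-hour half-life are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Article:
    title: str
    bias: float         # hypothetical scale: 0.0 = unbiased, 1.0 = strongly biased
    reliability: float  # hypothetical scale: 0.0 = unreliable, 1.0 = very reliable
    published: datetime


def rank_score(article: Article, now: datetime, half_life_hours: float = 24.0) -> float:
    """Quality (low bias, high reliability) discounted by the article's age."""
    age_hours = max((now - article.published).total_seconds() / 3600.0, 0.0)
    recency = 0.5 ** (age_hours / half_life_hours)  # halves every half_life_hours
    quality = (1.0 - article.bias) * article.reliability
    return quality * recency


def rank_articles(articles: list[Article], now: datetime) -> list[Article]:
    """Return articles sorted from highest to lowest score."""
    return sorted(articles, key=lambda a: rank_score(a, now), reverse=True)


if __name__ == "__main__":
    now = datetime(2020, 10, 17, 12, 0, tzinfo=timezone.utc)
    sample = [
        Article("Two-day-old report", 0.1, 0.9, now - timedelta(hours=48)),
        Article("Fresh report", 0.1, 0.9, now),
    ]
    for article in rank_articles(sample, now):
        print(article.title, round(rank_score(article, now), 3))
```

With an exponential decay like this, two articles of equal quality are ordered by age, while a much more reliable older article can still outrank a weak recent one.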
HackNews prototyped a web application that allows users to vote on articles as “legitimate” or “fabricated,” as well as to share comments to exercise their freedom of speech. HackNews ranks the credibility of the news according to the results of user voting. The project focused on Yemen.
The prototype aimed to create a strong backend service that is capable of rating any relevant news article. Based on a trained data set from media sources in the United Kingdom, selected news articles were each assigned a legitimacy score. Several factors went into determining the legitimacy score of a given article, such as the number of quotes, the tone, the number of typos, the number of offensive words, the number of links the article has, and the word count. To provide a basic visualization of the results, the team created a webpage that neatly lists the final rankings.
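To make the idea of a legitimacy score concrete, here is a minimal sketch that combines the listed factors as a weighted sum. The team's actual model was trained on data and is not described in detail, so the feature names, the hand-picked weights, and the 800-word length cap below are all hypothetical.

```python
# Hypothetical weights: positive values reward a feature, negative values penalize it.
DEFAULT_WEIGHTS = {
    "quotes": 2.0,            # per quoted source
    "links": 1.0,             # per outbound link
    "neutral_tone": 10.0,     # tone on an assumed 0.0 (charged) to 1.0 (neutral) scale
    "typos": -3.0,            # per typo
    "offensive_words": -5.0,  # per offensive word
    "length": 5.0,            # applied to word count, capped at 800 words
}


def legitimacy_score(features: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of article features; higher means more likely legitimate."""
    length = min(features["word_count"] / 800.0, 1.0)
    return (
        weights["quotes"] * features["quotes"]
        + weights["links"] * features["links"]
        + weights["neutral_tone"] * features["neutral_tone"]
        + weights["typos"] * features["typos"]
        + weights["offensive_words"] * features["offensive_words"]
        + weights["length"] * length
    )


if __name__ == "__main__":
    careful = {"quotes": 4, "links": 3, "neutral_tone": 0.9,
               "typos": 0, "offensive_words": 0, "word_count": 900}
    sloppy = {"quotes": 0, "links": 0, "neutral_tone": 0.2,
              "typos": 6, "offensive_words": 2, "word_count": 200}
    print(legitimacy_score(careful), legitimacy_score(sloppy))
```

In a trained system, the weights would be learned from labeled articles instead of being set by hand.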
Because of the project’s focus on COVID-19, NewsReel was also selected as a winner of the IBM: Community Response to COVID-19 challenge.
Focusing on the United Kingdom, JAVE developed Newsworthy. Newsworthy uses Natural Language Processing to determine whether or not a news article is untruthful based on a pre-trained model of vocabulary. To improve accuracy, the team added user feedback as a second heuristic. The application takes the URL of an article selected by the user and applies the initial NLP heuristic to the article. Next, users can upvote or downvote articles depending on how they feel about them. Using a unique algorithm that combines both heuristics, the website outputs a score that is used to rank the article. Articles with a higher score bubble to the top, while those with a lower score fall to the bottom.
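One simple way to combine a model score with user votes is a weighted blend, sketched below. This is not Newsworthy's actual algorithm, which the description does not specify: the 0-to-1 `nlp_score`, the smoothed vote ratio, and the 0.7 weight are assumptions chosen for illustration.

```python
def vote_ratio(upvotes: int, downvotes: int, prior: int = 1) -> float:
    """Share of upvotes, smoothed so an article with no votes sits at 0.5."""
    return (upvotes + prior) / (upvotes + downvotes + 2 * prior)


def combined_score(nlp_score: float, upvotes: int, downvotes: int,
                   nlp_weight: float = 0.7) -> float:
    """Blend an NLP truthfulness score (assumed 0.0 to 1.0) with user votes."""
    return nlp_weight * nlp_score + (1.0 - nlp_weight) * vote_ratio(upvotes, downvotes)


def rank(articles: list[dict]) -> list[dict]:
    """Sort articles so that higher combined scores come first."""
    return sorted(
        articles,
        key=lambda a: combined_score(a["nlp_score"], a["upvotes"], a["downvotes"]),
        reverse=True,
    )


if __name__ == "__main__":
    articles = [
        {"url": "https://example.com/a", "nlp_score": 0.6, "upvotes": 40, "downvotes": 2},
        {"url": "https://example.com/b", "nlp_score": 0.6, "upvotes": 1, "downvotes": 30},
    ]
    for article in rank(articles):
        print(article["url"])
```

The smoothing keeps a handful of early votes from dominating the score, and the weight controls how much user feedback can move an article relative to the model's judgment.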
Winners in Other Categories
While these two projects didn’t win NewsQ prizes, we’re very glad that they were recognized with other awards from HackGT 7.
Optimizing News Recommendation Algorithms
(Winner of Emerging Hackers: Best in Curriculum Track)
(Winner of MLH: Best use of DataStax Astra)
Congratulations to the NewsQ Hackathon Winners!
The NewsQ team was surprised and delighted by the amount of student interest in this hackathon challenge. We would like to congratulate all participants for answering the challenge of how to “ideate” or foster new ideas that help clarify what constitutes – and how to define – quality in news ranking and recommendation.
We’re looking forward to how all hackathon participants will bring their creativity to future social good challenges in years to come!