A machine learning-based recommender for news providers
This prototype automatically suggests related news content. It takes in one or more news feeds, parses them with Universal Feed Parser, and converts the entries to a pandas DataFrame. It then takes the headline of a randomly selected news story and performs named entity recognition (NER) using the natural language processing library spaCy and its en_core_web_sm English language model. NER is the task of identifying named entities, such as people, places, and organizations, in a given text. Finally, it uses the CountVectorizer class and the cosine_similarity function from the machine learning library scikit-learn to create a document-term matrix and identify similar headlines.
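The final step, comparing headlines through a document-term matrix, can be sketched roughly as follows. This is a minimal illustration, not the prototype's actual code: the sample headlines, the `most_similar` helper, and the choice of `stop_words="english"` are all assumptions made for the example, and the feed parsing and NER steps are left out.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical headlines standing in for entries parsed from news feeds.
headlines = [
    "Apple unveils new iPhone at California event",
    "New iPhone unveiled by Apple in California",
    "Storm brings heavy rain to northern France",
    "Local bakery wins national bread award",
]


def most_similar(query_index, texts, top_n=2):
    """Return (index, score) pairs for the texts closest to texts[query_index].

    Hypothetical helper: builds a document-term matrix of word counts and
    ranks every other text by cosine similarity to the query row.
    """
    # Rows are headlines, columns are vocabulary terms, values are counts.
    matrix = CountVectorizer(stop_words="english").fit_transform(texts)
    # Compare the query row against all rows; result has shape (1, n_texts).
    scores = cosine_similarity(matrix[query_index], matrix).ravel()
    # Drop the query itself, then sort by descending similarity.
    ranked = [(i, float(s)) for i, s in enumerate(scores) if i != query_index]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


results = most_similar(0, headlines)
for index, score in results:
    print(f"{score:.2f}  {headlines[index]}")
```

Here the second headline should rank first because it shares several terms with the query, while the unrelated headlines score zero. Raw word counts are sensitive to headline wording, so in practice the entities found by NER could be used to narrow or weight the comparison.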