Why ‘Natural Language Processing’ Matters for Effe

By Loic Moisand

Every day, millions of people comment online on how they feel about items they’ve bought or are considering buying. As this constant flow of opinion is not limited to products, but includes thoughts on the brands themselves and their marketing efforts, it provides a fairly complete picture of a given brand’s image.

Such a large quantity of data, however, can be both an advantage and a problem. For instance, a simple search in Synthesio’s “Flash Dash,” an ad hoc insights tool that quickly analyzes global data, for the week of September 16-22 shows an average of 50,000 comments per day for big brands such as Microsoft, Samsung and Ford. For humans to analyze all this data would be prohibitively expensive and time consuming.

To solve this problem, we use machine learning algorithms from the field known as Natural Language Processing (NLP). There are two main classes of such algorithms: supervised and unsupervised. Supervised methods learn by example, and are mostly classification systems, such as noise filters or Automatic Sentiment Analysis (ASA). Unsupervised methods try to summarize information from large quantities of data. Their most common applications are visualization systems, such as word clouds that show the most common words and their relations as an image, and key information detection, such as finding the most representative sentences or comments in a group of documents.

The focus of this article is on supervised methods, and more particularly on classification algorithms. One of the first applications we can think of is noise filtering. With Apple, for example, the main problem is deciding whether a comment containing the word “apple” refers to the company or to the fruit. By presenting examples of what is pertinent and what is not, we let the classification algorithm learn how to select only the desired data.

A second application is Automatic Sentiment Analysis (ASA).
Its objective is to classify whether a given comment carries a sentiment charge or not and, if so, what its tonality is: positive or negative. While noise filtering serves to remove all unrelated comments, ASA systems help us understand the perception of a brand or product over time.

Finally, we can also classify the topic of a comment. It can be about the product (price, quality, design) or about the company itself, such as its ecological footprint, marketing campaign or customer service. Combining all of these techniques creates a monitoring system that is more reliable than ones based solely on keyword search, without the need for an army of analysts.

Of course, one can question how reliable these classification systems are when compared with a human. As early as 1979, Dawes published an article in American Psychologist titled “The robust beauty of improper linear models in decision making.” As the article explains, classification produced by linear models is superior to clinical intuition when making decisions. Classification of a text is a decision, and NLP systems use these same linear models to make classification decisions based on input parameters. Given a large enough quantity of examples to train the system, the linear model will make good decisions.

Although linear models, such as logistic regression, are well understood and have been performing well for a long time, the issue remains the definition of the input parameters; in other words, how to represent the comment’s text. The main focus of NLP is therefore how to represent a comment. Let’s think of an example from medicine: if we want to classify whether a person is obese or not, weight alone is a weak parameter, whereas BMI is much more accurate. In terms of text classification, it is clear by now that the word “the” is of little importance to the classification.
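
As a rough illustration of the points above, the following Python sketch represents each comment as a bag of words with stop words such as “the” removed, then trains a small logistic regression as a noise filter for the Apple example. The training comments, the stop-word list and all function names are hypothetical and chosen only for this example; a production system would rely on far more data and richer features.

```python
import math
import re

# Hypothetical stop-word list and training data, for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "of", "my", "and", "for", "in", "to", "i", "this"}

# Label 1 = about the company, label 0 = about the fruit (noise).
TRAIN = [
    ("the new apple iphone is expensive", 1),
    ("apple released a new macbook", 1),
    ("my apple laptop battery is great", 1),
    ("i baked an apple pie", 0),
    ("fresh apple juice for breakfast", 0),
    ("the apple tree in my garden", 0),
]


def tokenize(text):
    """Lowercase, split into words, and drop stop words such as 'the'."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]


def build_vocabulary(texts):
    """Map every remaining word to a column index."""
    vocab = {}
    for text in texts:
        for word in tokenize(text):
            vocab.setdefault(word, len(vocab))
    return vocab


def vectorize(text, vocab):
    """Bag-of-words vector: how often each known word occurs in the text."""
    vec = [0.0] * len(vocab)
    for word in tokenize(text):
        if word in vocab:
            vec[vocab[word]] += 1.0
    return vec


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, epochs=500, lr=0.5):
    """Plain stochastic gradient descent on the logistic loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = p - y
            bias -= lr * error
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return weights, bias


def predict_relevant(text, vocab, weights, bias):
    """Estimated probability that the comment is about the company."""
    x = vectorize(text, vocab)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


texts = [t for t, _ in TRAIN]
labels = [y for _, y in TRAIN]
vocab = build_vocabulary(texts)
rows = [vectorize(t, vocab) for t in texts]
weights, bias = train_logistic_regression(rows, labels)

for comment in ("apple macbook battery", "apple pie and juice"):
    print(comment, "->", round(predict_relevant(comment, vocab, weights, bias), 3))
```

Note that the word “apple” occurs in every training comment, so it cannot separate the two classes on its own; the model has to lean on the surrounding words, which is exactly why the choice of representation matters.
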
Likewise, if we want to select information on Steve Jobs, simply matching the word “jobs” might not work well. In sum, what matters is not only which words are present in the text, but also what each word means in context. This is especially true for ASA: compare “the blues brothers” with “having the blues.” Current systems try to capture words in their context and how they connect to other words, for instance whether a word is an adjective or a noun, and which nouns are connected to which verbs.

We know now that NLP algorithms, even if difficult to develop, can help us sort through massive quantities of data more easily. But under what conditions will the system underperform? The big problem is unclear information, caused either by the use of irony and sarcasm in a message or by a lack of context. Humans perform relatively well in these conditions, as they have a complete database of knowledge and contextual experience that allows them to understand the true meaning of each message. Nevertheless, in these situations, not even human readers will fully agree on every decision. Irony and sarcasm are actively studied in the academic world, but current state-of-the-art systems cannot solve the problem completely.

A more minor problem is the quality of writing, particularly on micro-blogging platforms. We need to normalize messages so that the system understands that “w8ing” means “waiting.”

Where these particular difficulties do not arise, we can achieve accuracies of 85% to 95% for noise filters and up to 85% for ASA systems in terms of tonality.

Now that we have removed all the non-relevant data and classified each verbatim into one or more topics, we can start to analyze our data. In our new NLP dashboard for the brand Apple, 30,000 of the 50,000 comments will have been removed because they refer to apple juice and apple pie. In the remaining 20,000 we will detect those that talk about design, price, operating system or new products.
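
The funnel described above, from raw comments to relevant comments to topic counts, could be sketched in Python roughly as follows. The `Comment` structure, its field names and the sample comments are assumptions made for this illustration; in practice the `relevant` flag and the `topics` list would be produced by the trained noise filter and topic classifier rather than written by hand.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Comment:
    text: str
    relevant: bool                              # assumed output of the noise filter
    topics: list = field(default_factory=list)  # assumed output of the topic classifier


def topic_breakdown(comments):
    """Drop noise, then count how many relevant comments mention each topic."""
    kept = [c for c in comments if c.relevant]
    # set() ensures a comment is counted at most once per topic.
    counts = Counter(topic for c in kept for topic in set(c.topics))
    return len(kept), counts


# Hypothetical, hand-labelled sample standing in for classifier output.
comments = [
    Comment("The new iPhone design is gorgeous", True, ["design", "new products"]),
    Comment("Apple pie recipe for the weekend", False),
    Comment("MacBook prices are too high", True, ["price"]),
    Comment("Fresh apple juice every morning", False),
    Comment("The latest OS update is slow and the price went up", True,
            ["operating system", "price"]),
]

kept, counts = topic_breakdown(comments)
for topic, n in counts.most_common():
    print(f"{topic}: {n} of {kept} relevant comments")
```

Because a comment can belong to more than one topic, the per-topic counts may add up to more than the number of relevant comments.
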
Applying Automatic Sentiment Analysis will give us an idea of whether the talk is mostly positive, negative or neutral. One important metric now at our disposal is how the conversation sentiment evolves over time: we can see whether it is getting more negative or more positive for a specific topic.

Finally, we can use unsupervised methods to visualize the complete data on a topic. Using a word cloud, we can see the main words associated with a specific topic. We can imagine that, while visualizing the topic “price,” the word “expensive” shows up very often, and that it is used in the same context as “quality.” If the word “bad” also appears in that context, or the sentiment is mostly negative, we should take a detailed look at what people are saying.

Loic Moisand is founder and CEO of Synthesio.