Mine Unstructured Data With AI

With powerful computing resources and advances in machine learning, unstructured data is becoming easier and cheaper for businesses to turn into a usable source of insight.

Structured data alone isn’t enough to get a full view of an organization. Valuable patterns and insights are contained in textual content: emails, memos, customer service chats, social media streams, and news articles. The challenge is to extract that value from millions of pages of content at high speed and deliver it in a visually interactive format. Users should be able to navigate and search the content, discover correlations within it, federate it with existing structured data, and make predictions about products, topics, events, trends, and even themes and emotions.

Research suggests that unstructured data may account for 80 to 90 percent of content generated globally, which makes it a tremendous source of untapped value. Fortunately, advances in artificial intelligence (AI) and machine learning now make it possible and affordable to sift through vast amounts of unstructured data and find meaning in it. Sources include video and audio files, emails, logs, social media posts, and even notifications from Internet of Things (IoT) devices.

Wissen has best-in-class capabilities to mine unstructured data with technologies such as Artificial Intelligence (AI), Machine Learning, Natural Language Processing (NLP), Pattern Recognition and Optimization. We have expertise in extracting information from unstructured and semi-structured data sets, and we build learning models using a cutting-edge technology stack and current research. We drive Digital Enablement through a combination of NLP and Machine Learning techniques.

For example, we have built an Intelligent Search solution for Research Reports. It combines NLP techniques such as Dependency Graph extraction, Word Embeddings and term frequency-inverse document frequency (tf-idf) to search a vast corpus of research articles and return the most relevant results for clients’ questions. Another example is our solution for Generating Comparisons from News Feeds, which combines structured and unstructured news feeds to extract company attributes for comparison. We have also built a solution for Automated Categorization, which generates hyper-targeted feeds of companies, investors and acquirers from unstructured news feeds. One more example is our Churn Prediction solution, which produces a Churn Risk score and an Intent to Buy score for every customer over the next 30 days. The solution uses Machine Learning algorithms, and the data universe for the models consists of user interactions, interaction frequency, transaction history, logs, online feedback and similar signals.
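To make the tf-idf part of such a search concrete, here is a minimal sketch in Python using only the standard library. It is not Wissen's implementation: it leaves out dependency graphs and word embeddings, and the function names, the smoothed idf formula and the sample documents are assumptions chosen for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase and split on non-alphanumeric characters (a deliberately crude tokenizer).
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()

def build_index(docs):
    """Return one tf-idf vector (dict term -> weight) per document, plus the idf table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed idf (an assumed variant) so no term gets a zero or negative weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (cnt / len(toks)) * idf[t] for t, cnt in tf.items()})
    return vectors, idf

def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query, docs, vectors, idf):
    """Rank documents by cosine similarity to the query; drop non-matching documents."""
    tf = Counter(t for t in tokenize(query) if t in idf)  # ignore unseen terms
    total = sum(tf.values())
    if total == 0:
        return []
    q_vec = {t: (c / total) * idf[t] for t, c in tf.items()}
    scored = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [(docs[i], score) for score, i in ranked if score > 0]

# Hypothetical mini-corpus standing in for research articles.
DOCS = [
    "Bank earnings rise on strong loan growth",
    "Oil prices fall as supply increases",
    "Loan growth slows at regional bank",
]
VECTORS, IDF = build_index(DOCS)
for doc, score in search("oil supply", DOCS, VECTORS, IDF):
    print(f"{score:.3f}  {doc}")
```

In a production setting the tf-idf score would typically be one signal among several, for instance combined with embedding similarity to catch synonyms that exact term matching misses.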

Machine Learning

Wissen believes that every application is going to be an intelligent application. We provide clients with expert guidance and a roadmap to reach maturity with the technologies and processes involved. We help clients move from experiments and proofs of concept to a stage where standards are set, processes are deployed across the firm, value is delivered continuously, and improvement is ongoing. Wissen’s philosophy and approach are captured in the diagram below; Wissen has the capabilities to implement a solution end to end. The capabilities include:

  • Data Management (using tools like Redshift, HANA, Google BigQuery, Vertica, Greenplum, MongoDB (NoSQL), SQL Azure, Hadoop, Apache Spark)
  • Analytics (R, Julia, SAS, TensorFlow, Mahout, scikit-learn, NLTK, Lucene)
  • Decisions (CPLEX, Gurobi, MATLAB, Maple, Mathematica)

Natural Language Understanding (NLU)

Wissen provides NLU solutions that encompass Natural Language Processing and Generation. Our tools process Big Text and reveal valuable information and actionable data for users. The solution’s components can be used independently or combined to create more powerful solutions. The solution uses linguistic analysis, statistical modeling, and machine learning. Its capabilities include Entity Extraction, Sentiment Analysis, Keyword Extraction, Concept Tagging, Relationship Extraction, Taxonomy Classification, Author Extraction, Language Detection, Text Extraction, Microformats Parsing, Feed Detection, Linked Data Support and Optical Character Recognition (OCR).

Use Cases in NLU

Categorization & Process Automation

  • Automatically categorize incoming unstructured requests and map them to specific workflows
  • For example, automatically handle client requests such as changes of address or payment instructions
  • Process automation for operations-intensive jobs such as cheque clearance
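As an illustration of request categorization, the sketch below trains a small multinomial Naive Bayes classifier with add-one smoothing and maps the predicted category to a workflow queue. The technique is a stand-in chosen for brevity, not necessarily what Wissen uses; the class name, training sentences, category labels and queue names are all hypothetical.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesRouter:
    """Multinomial Naive Bayes over whitespace tokens, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of training texts
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            tokens = text.lower().split()
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)

    def classify(self, text):
        # Tokens never seen in training carry no information, so they are skipped.
        tokens = [t for t in text.lower().split() if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical training data and workflow mapping.
TRAINING = [
    ("please update my mailing address", "change_of_address"),
    ("i have moved to a new address", "change_of_address"),
    ("send the payment to this account", "payment_instruction"),
    ("wire the payment tomorrow", "payment_instruction"),
    ("when will my cheque clear", "cheque_clearance"),
    ("deposited cheque has not cleared", "cheque_clearance"),
]
WORKFLOWS = {
    "change_of_address": "client-records queue",
    "payment_instruction": "payments queue",
    "cheque_clearance": "clearing-operations queue",
}

router = NaiveBayesRouter()
router.train(TRAINING)

def route(request_text):
    """Return the (category, workflow queue) pair for an incoming request."""
    category = router.classify(request_text)
    return category, WORKFLOWS[category]

print(route("wire payment to account"))
```

A real deployment would need far more training data, proper tokenization, and a confidence threshold below which requests fall back to manual triage.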

Search & Customer Interaction

  • Recognition of entities and their relationships from research documents
  • Automation of process inquiries through chatbots and search
  • Providing relevant content to internal research teams
  • Multilingual communication

Legal Documents

  • Parse legal documents and infer contract terms
  • Convert unstructured documents to structured format for further processing
  • For example, extracting triggers from contracts for monitoring of trigger events

Pattern Recognition

Use Cases in Pattern Recognition

Fraud Detection

  • Credit Card Fraud
  • Correlate electronic communication with trading activities to infer collusion
  • Detect anomalous trading or client activity
  • Help proactively identify misconduct of the kind seen in the LIBOR fixing scandal
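One simple way to detect anomalous activity is a robust outlier test. The sketch below flags values using the modified z-score, which is based on the median and the median absolute deviation (MAD) and is therefore less distorted by the outliers it is trying to find. The function name, the sample volumes and the 3.5 cutoff (a commonly used default) are assumptions for illustration; real fraud detection would combine many more signals.

```python
import statistics

def flag_anomalies(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        # More than half the values equal the median; the score is undefined here,
        # so this sketch reports nothing rather than dividing by zero.
        return []
    flagged = []
    for i, v in enumerate(values):
        # 0.6745 rescales MAD so the score is comparable to a standard z-score
        # when the data are normally distributed.
        score = 0.6745 * (v - median) / mad
        if abs(score) > threshold:
            flagged.append(i)
    return flagged

# Hypothetical daily trade counts for one client; the last day stands out.
daily_trades = [100, 102, 98, 101, 99, 103, 97, 500]
print(flag_anomalies(daily_trades))
```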

Customer Segmentation

  • Context-aware CRM solution
  • Segmentation of customers based on customer profile and activity
  • Creditworthiness assessment and automation of loan issuance
  • Upsell and cross-sell opportunities, such as recommending banking products based on spending patterns
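Segmentation by profile and activity is often approached with clustering. The following is a minimal k-means sketch, assuming each customer is described by two hypothetical features (monthly spend and transactions per month). It uses a deterministic farthest-point initialisation so results are repeatable; the feature choice and data are illustrative only.

```python
import math

def kmeans(points, k, iterations=20):
    """Cluster points (tuples of floats) into k groups; return labels and centroids."""
    # Farthest-point initialisation: start from the first point, then repeatedly
    # pick the point farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Hypothetical customers: (monthly spend, transactions per month).
customers = [(200, 2), (220, 3), (210, 2), (5000, 40), (5200, 45), (4900, 38)]
labels, centroids = kmeans(customers, k=2)
print(labels)
print(centroids)
```

Because the two features are on very different scales, a real pipeline would usually standardize them first so that spend does not dominate the distance calculation.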

Production Monitoring

  • Outage prevention by monitoring correlated events
  • Learning from data for past outages
  • Monitor deviation from standard execution patterns

Optimization

Use Cases in Optimization

Collateral Optimization

  • Solution to determine the best way to finance an asset on the book, based on cost and balance sheet impact
  • Collateral can be used to meet margin requirements at an exchange, to raise cash in repo markets, or for securities lending
  • The optimization model also captures the impact on the balance sheet and on Risk Weighted Assets (RWA)
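A heavily simplified version of this problem can be sketched as follows: cover a single margin requirement at the lowest cost, where each asset has a haircut and a cost rate. The `cost_rate` field is an assumed blended figure that could fold in funding cost and balance sheet or RWA charges; the asset names and numbers are hypothetical. With one requirement and divisible assets, picking assets in order of cost per unit of post-haircut value is sufficient; problems with multiple venues and eligibility rules would need a linear or mixed-integer solver such as CPLEX or Gurobi.

```python
from dataclasses import dataclass

@dataclass
class Asset:
    name: str
    market_value: float  # amount available to pledge
    haircut: float       # fraction of value not credited, e.g. 0.10 for 10%
    cost_rate: float     # assumed blended cost per unit of market value pledged

def allocate(assets, requirement):
    """Return [(asset name, market value pledged)] covering the requirement cheaply."""
    remaining = requirement
    plan = []
    # Cheapest first, measured per unit of collateral value after the haircut.
    for asset in sorted(assets, key=lambda a: a.cost_rate / (1 - a.haircut)):
        if remaining <= 1e-9:
            break
        usable = asset.market_value * (1 - asset.haircut)
        covered = min(usable, remaining)
        plan.append((asset.name, covered / (1 - asset.haircut)))
        remaining -= covered
    if remaining > 1e-9:
        raise ValueError("available collateral does not cover the requirement")
    return plan

# Hypothetical inventory.
inventory = [
    Asset("Treasury", 100.0, 0.02, 0.010),
    Asset("CorpBond", 100.0, 0.10, 0.004),
    Asset("Equity", 100.0, 0.25, 0.002),
]
for name, pledged in allocate(inventory, requirement=120.0):
    print(f"{name}: pledge {pledged:.2f}")
```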

Portfolio Construction

  • Construction of portfolios on the efficient frontier, with the maximum expected return for a given level of risk
  • Generation of alpha signals based on correlation of factor returns
  • Quantitative approach to portfolio construction
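The smallest worked case of mean-variance construction is a portfolio of two assets, where the minimum-variance weight has a closed form. The sketch below computes that weight and the resulting return and risk; the function names and the input figures are illustrative assumptions, and a realistic portfolio would require a quadratic programming solver over many assets and constraints.

```python
import math

def portfolio_stats(w, mu1, mu2, sigma1, sigma2, rho):
    """Expected return and volatility for weight w in asset 1 and 1 - w in asset 2."""
    expected_return = w * mu1 + (1 - w) * mu2
    variance = (
        w ** 2 * sigma1 ** 2
        + (1 - w) ** 2 * sigma2 ** 2
        + 2 * w * (1 - w) * rho * sigma1 * sigma2
    )
    return expected_return, math.sqrt(variance)

def min_variance_weight(sigma1, sigma2, rho):
    """Weight in asset 1 that minimizes portfolio variance (short positions allowed)."""
    cov = rho * sigma1 * sigma2
    denom = sigma1 ** 2 + sigma2 ** 2 - 2 * cov
    if denom == 0:
        # Happens when both assets have equal volatility and correlation of 1.
        raise ValueError("variance does not depend on the weight")
    return (sigma2 ** 2 - cov) / denom

# Hypothetical inputs: annual expected returns, volatilities and correlation.
mu1, mu2 = 0.05, 0.09
sigma1, sigma2, rho = 0.10, 0.20, 0.0

w_star = min_variance_weight(sigma1, sigma2, rho)
ret, risk = portfolio_stats(w_star, mu1, mu2, sigma1, sigma2, rho)
print(f"weight in asset 1: {w_star:.2f}, return: {ret:.4f}, risk: {risk:.4f}")
```

Sweeping `w` from the minimum-variance weight toward the higher-return asset traces the efficient part of the frontier for this two-asset case.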

Segregation

  • Determination of the assets that are most profitable for the bank to use, after applying the requirements of customer protection rules