By Nika Tamaio Flores, Head of Consulting at Data Science UA
Four billion people have access to the Internet. By 2020, each of them will generate 2 MB of data per second. Every minute, users make 3.8 million search queries on Google and send approximately 13 million text messages using instant messengers. Companies have been collecting information on sales, wages, and competitors for years. The volume of global data is expected to skyrocket from 33 zettabytes (33,000,000,000 terabytes) last year to 175 zettabytes in 2025.
What is data science?
Data utilisation is one of the critical competitive advantages for modern companies. According to the International Institute for Analytics, by 2020, data-driven companies will outperform competitors who do not use data to make decisions by no less than $430 billion.
The volumes of accumulated data are already so huge that traditional methods are no longer suitable for evaluating and analysing them. Yet without such analysis, management cannot make competent decisions. This is how we came to use data science.
Data science is a set of algorithms, approaches, and methods for finding hidden dependencies in acquired data. It is an area of expertise that combines fundamental mathematics, computer science, and domain knowledge.
Automation and analysis of data in banks
The banking sector is actively investing in new technologies: financial sector companies account for almost 13% of all investments in data analysis. If banks invest in artificial intelligence and human-machine collaboration at the same rate as technology companies, they will increase their revenue by an average of 34% by 2022.
According to the McKinsey Global Institute, 43% of all financial and insurance operations can be automated with existing technologies.
Algorithms are expected to replace 98% of cashiers, brokers, and loan officers. Mass automation will dramatically change how work is planned and coordinated, how resources are allocated, and how performance is monitored.
New multidisciplinary jobs will replace the fading professions. One example is the business translator, a bridge between business customers and technical executives. Such a professional knows both the domain and how various algorithms can be applied to solve company problems. A business translator can lead technology implementation projects as well as interpret model results.
Accenture interviewed 100 CEOs and top managers of banks about using AI and training employees to interact with machines. 74% of respondents believe that new technologies will completely transform the banking process. At the same time, only 26% noted that one in four bank employees has all the necessary skills to work with them, and only 3% plan to invest in retraining staff during the next three years. Although bank executives recognise that technology is changing the core banking process, only a few plan to significantly increase investment in retraining employees to implement these technologies in the near future.
Data science tasks in banks are not limited to customer data analysis. Modern technologies can handle tasks such as fraud detection, routine task automation, risk management, investment portfolio management, and employee recruitment and onboarding, among many others.
More and more banks are gradually turning into data-driven companies. The lack of a data culture is a stumbling block on the way to this transformation. It is not enough to declare data-based decision-making principles. It is often necessary to reorganise both the technical infrastructure and the thinking of line employees and managers.
Another challenge is legacy systems. Modern technologies for data analysis and processing require a specific structure and quality of data. Most banks carry a substantial inherited burden: outdated software, heterogeneous architecture, and employees with a low level of technical skill. Combined with a lack of documentation, this turns the transition to more advanced systems into a nightmare.
Another challenge is the quality and completeness of the data. The phrase “garbage in, garbage out” is popular among data experts. All systems that require manual entry of information are full of errors: for example, ten spellings of the name “Natalya,” or an additional zero accidentally typed into a sum. Poor-quality data cannot be the basis of high-quality models, and a large amount of low-quality data does not improve the situation.
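The two manual-entry errors mentioned above can be caught with very simple checks. The following is a minimal Python sketch, not a production data-quality pipeline: the function names, the 0.8 similarity threshold, and the "roughly ten times the median" rule are all illustrative assumptions, and the sample values are made up for demonstration.

```python
from difflib import SequenceMatcher
from statistics import median


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Return True if two names are near-duplicates after basic clean-up."""
    ratio = SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()
    return ratio >= threshold


def group_spellings(names, threshold: float = 0.8):
    """Greedily group names whose spelling is close to a group's first entry."""
    groups = []
    for name in names:
        for group in groups:
            if similar(name, group[0], threshold):
                group.append(name)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append([name])
    return groups


def flag_extra_zero(amounts, factor: float = 10.0, tolerance: float = 0.25):
    """Flag amounts that are roughly `factor` times the median amount.

    This is a crude heuristic (an assumption of this sketch) for spotting
    a sum that may contain an accidentally typed extra zero.
    """
    typical = median(amounts)
    if typical <= 0:
        return []
    return [
        amount
        for amount in amounts
        if abs(amount / typical - factor) <= factor * tolerance
    ]


# Made-up sample data, for illustration only.
names = ["Natalya", "Natalia", "natalya ", "Nataliya", "Oleg"]
amounts = [1200.0, 1150.0, 1300.0, 12500.0, 1250.0]

name_groups = group_spellings(names)
suspicious = flag_extra_zero(amounts)
print(name_groups)
print(suspicious)
```

Flagged records would still need human review: a similarity threshold can merge genuinely different names, and a payment ten times larger than usual is not always a typo.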
Let us not forget about the challenges of ethics and security. An algorithm inherits the logic of its creator. The creators face questions such as “Should I include gender or race when granting a loan?”, “What third-party data can be used to make decisions in the banking sector?”, and “How can sensitive financial data be protected from leaks?”. The main question, however, is this: should the world of the future, the world where algorithms make decisions, be just an automated version of our society, or do we want to change it in some way?
Examples of application of data science technologies in the banking sector
In 2016, Bank of America presented a virtual assistant named Erica. It helps more than 45 million bank customers carry out banking operations in a familiar messenger interface. Erica uses natural language processing (NLP) technology to interact with people naturally. It is noteworthy that 2016 was the second most profitable year in the history of Bank of America.
Another American bank, Bank of New York Mellon Corp., has been actively automating routine tasks since 2017: more than two hundred bots now process documents, make transfers, and correct employees’ mistakes. As a result, the speed of processing a funds transfer request increased by 88%, and reconciliation procedures that used to take 5-10 minutes now take a quarter of a second.
JPMorgan Chase introduced Contract Intelligence (COiN) to process loan applications. The system uses NLP to work through documents. It reduced manual labour by 360 thousand hours per year while processing about 12 thousand loan applications.
A large South Asian bank used data science technologies to find the factors that affect employee performance. The analysis drew on demographic and behavioural data, length of service, employees’ professional history, and information about departments and their effectiveness. As a result, common characteristics were found for both high-performing and low-performing employees, and profiles were built of the workers most likely to become highly effective. The results led to changes in how employees were allocated and to the creation of new organisational units around highly efficient employees. In the year after the new system was introduced, the bank reported an average 26% increase in branch productivity and a 14% increase in net income.
One British bank used a specific hiring scheme in which grades, school level, and recommendations had the most significant impact on employment decisions. However, a statistical analysis of the productivity of new employees showed the opposite. It turned out that the most striking predictor of success was the number of spelling and grammatical errors in the candidate’s CV: the fewer the mistakes, the higher the employee’s efficiency after hiring. Other important factors were more obvious, for example, successful past sales experience. By applying the new hiring system, the bank increased profit by $4 million over half a year.
Data science technologies are transforming the usual business models in most areas, and banks are no exception. Data is the new oil. If in the past oil companies were the engines of progress, now the world is moved forward by companies that use their data effectively.