How the China Securities Regulatory Commission began to regulate high frequency trading after the 2015 stock market crash

A translation by Siodhbhra Parkin of a Caixin article on the processes China established in 2015 to control the risks of computerized trading.

CSRC uses “big data” to conduct investigations

10:48:08 2015-11-20 Caixin author: Yue Yue

From issue 45 of “Caixin Weekly” for 2015, published on November 23, 2015

Will the CSRC’s decision to create a central regulatory information platform, upgrade the stock exchanges’ supervisory systems, and otherwise apply “big data” technology be enough to eliminate “rat trading” and insider trading?

After the stock market crash, many market participants were investigated and subjected to criminal compulsory measures. However, these significant stock market cases were handled directly not by the China Securities Regulatory Commission (CSRC), but by Public Security Departments. At the same time, the CSRC is also accelerating the pace of its investigations. According to people close to the regulatory authority, the CSRC has already collected clues from the data supplied by the stock exchanges, and it hopes to investigate forthcoming significant cases.

Stock exchange data has always been the most important source of law enforcement tips provided to the CSRC. In the first half of this year, the Shanghai and Shenzhen Stock Exchanges reported 253 actionable investigatory items, accounting for 69% of the total. Especially in recent years, big data monitoring technology developed by the stock exchanges has led regulators to many instances of “rat trading” and insider trading. Prior to this, the clues for these kinds of cases were usually discovered through reporting, on-site spot checks, or the investigation of other cases.
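The 69% share can be checked against the clue counts listed in the chart translated at the end of this article. The short sketch below simply redoes that arithmetic; the dictionary keys are labels invented here for readability.

```python
# Clue counts for the first half of 2015, taken from the chart at the end
# of the article. The key names are illustrative labels, not official terms.
clue_sources = {
    "stock_exchanges": 253,
    "regular_commission_supervision": 54,
    "other_channels": 42,
    "other_department_referrals": 20,
}

total_clues = sum(clue_sources.values())
exchange_share = clue_sources["stock_exchanges"] / total_clues

print(f"total clues: {total_clues}")
print(f"exchange share: {exchange_share:.1%}")
```

The four categories sum to 369 clues, and 253 of 369 rounds to the 69% quoted above.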

The term “big data technology methods” in fact refers to the data analysis work that the stock exchanges have conducted all along; while the concept has existed for some time, it is only in recent years that developments in software and hardware have made large-scale data analysis, previously extremely difficult, more practical. As stock exchange data exists within a relatively closed system, market supervision departments investigating illegal trading could previously look only at the account transaction data in their internal network, and were unable to connect that information with the behavior exhibited by the user on the external network (the internet). An analysis of the investigative methods employed by the CSRC in a series of legal cases suggests that there have been a number of breakthroughs in managing these technical difficulties.

For example, in April this year, an individual with the surname Wang, a branch supervisor at a bank in Changsha, used someone else’s account to buy 60,000 shares of Hunan Development (000722.SZ), and released false information that “Hunan Development may fundraise to acquire Caifu Securities” using an account with the codename “A Cloud Like A Cloud” on the Eastern Caifu Stock Platform. In the three trading days that followed, the price of the stock rose 26.3%, resulting in a net profit of more than 13,000 RMB for Wang. This is a typical example of a legal case in which market monitoring mechanisms used big data to connect internal and external networks.

The Shanghai and Shenzhen stock exchanges are currently stepping up efforts to upgrade their monitoring systems, and both plan on using big data technology as a trump card; the CSRC’s mission to build a super big data platform – the central regulatory information platform – is also gathering pace.

Big data “mousetrap”

The Boshi Fund case, involving an individual named Ma Le, is the first big data “mousetrap” case, that is, the first announced case in which much of the primary evidence was collected from daily monitoring data on the Shanghai Stock Exchange. The question on everyone’s mind: how exactly is the CSRC using big data monitoring to identify the network of suspicious fund managers, trace the links connecting managers and accounts, and thereby uncover “rat trading”?

According to information gleaned by Caixin reporters, the Shanghai Stock Exchange started researching the use of big data to help market regulation as early as 2003. The third generation monitoring system (3GSS) now in use began making use of “mass data analysis” methods to identify insider trading the moment it came online. After more than a decade of upgrades, the 3GSS system can now perform functions including transaction monitoring, market replay, query summaries, and transaction analysis, and is able not only to identify abnormal transactions, but also to take in other exchange data to resolve questions about related accounts. By analyzing the data of the whole market, it is possible to directly identify accounts with “abnormal trading behavior.”

In the words of a source from the system technology solutions provider Neusoft Group (600718.SH), “We started cooperating with the Shanghai Stock Exchange many years ago. They are responsible for proposing regulatory requirements, such as specific parameters defining abnormal transactions and insider trading suspicion indicators; we are responsible for software development, building frameworks and model data, and assigning technical personnel to the stock exchange to carry out regular system maintenance and software upgrades. Since there is a confidentiality agreement, the specific deployment process and technical solutions cannot be disclosed.”

A person familiar with this technology explained the basic principles of using data to investigate “rat trading”: the system can identify suspicious name-holders, and by means of an algorithmic model, the trading data of the fund or account is compared with the trading behavior of the millions of accounts nationwide. Where the trading type, time, and other indicators share a high degree of similarity, a suspicious account can be identified fairly easily.

“This algorithm is relatively complex. Sometimes we need to sample in ‘slices,’ and it is necessary to conduct precise analysis over spans ranging from as little as one second to a few months or even a year; sometimes we also need to narrow the scope of data mining, from the brokerage to the business department to the account. Regional connections and identity card information are also considered in the algorithm.” The source further explained that if “rat trading” is conducted using multiple accounts, a more advanced algorithm is needed.
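The comparison described by these sources can be illustrated with a minimal sketch. It is not the exchange’s algorithm, which is confidential; it assumes a simple rule in which an account’s trade “mirrors” a fund trade when it involves the same stock, in the same direction, within an adjustable time window (the “slice”). The `Trade` class, `convergence_score`, `rank_suspects`, the threshold, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Trade:
    account: str
    stock: str
    side: str        # "buy" or "sell"
    time: datetime


def convergence_score(fund_trades, account_trades, window):
    """Share of an account's trades that mirror a fund trade.

    A trade mirrors a fund trade when it is in the same stock, in the
    same direction, and within `window` of the fund trade's timestamp.
    """
    if not account_trades:
        return 0.0
    mirrored = sum(
        1
        for t in account_trades
        if any(
            f.stock == t.stock
            and f.side == t.side
            and abs(f.time - t.time) <= window
            for f in fund_trades
        )
    )
    return mirrored / len(account_trades)


def rank_suspects(fund_trades, trades_by_account, window, threshold=0.8):
    """Return (account, score) pairs at or above `threshold`, highest first."""
    scores = {
        account: convergence_score(fund_trades, trades, window)
        for account, trades in trades_by_account.items()
    }
    flagged = [(a, s) for a, s in scores.items() if s >= threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Toy data: account "A" shadows the fund's trades, account "B" does not.
t0 = datetime(2015, 4, 1, 10, 0)
fund = [
    Trade("fund", "S1", "buy", t0),
    Trade("fund", "S2", "sell", t0 + timedelta(days=3)),
]
accounts = {
    "A": [
        Trade("A", "S1", "buy", t0 - timedelta(minutes=30)),
        Trade("A", "S2", "sell", t0 + timedelta(days=3, minutes=10)),
    ],
    "B": [
        Trade("B", "S3", "buy", t0),
        Trade("B", "S1", "sell", t0),
    ],
}
suspects = rank_suspects(fund, accounts, window=timedelta(hours=1))
print(suspects)
```

Changing `window` from seconds to months corresponds to the different “slices” the source mentions; a production system would also need indexing to avoid comparing every pair of trades.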

Data relating to the account’s transaction characteristics and capital flow are also important clues in identifying “rat trading.” A source close to the stock exchange revealed that investigators generally focus first on whether the account shows short-term manipulation, including behaviors such as frequently placing and withdrawing orders, attempting to attract others to acquire a stock, and driving up stock prices at key moments. Second, they examine historical transaction data to determine the account’s operating style and stock preferences, and whether long-term trading has turned into short-term trading or an inactive account has suddenly become very active. Finally, they look at the beneficiaries related to the account to determine the motive for the transactions, such as whether the account is related to major shareholders, actual controllers, etc.
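As a rough illustration of these three screening steps, the sketch below computes an order-cancellation ratio and an activity surge relative to an account’s own history, then combines them with a flag for links to a controller. The function names, input layout, and thresholds are assumptions made for this example, not parameters disclosed by the exchange.

```python
def cancel_ratio(orders):
    """Fraction of orders that were withdrawn; `orders` is a list of dicts
    with a boolean "cancelled" key (a hypothetical record layout)."""
    if not orders:
        return 0.0
    return sum(1 for o in orders if o["cancelled"]) / len(orders)


def activity_surge(monthly_trade_counts, recent_months=1):
    """Ratio of recent average monthly trades to the earlier average."""
    history = monthly_trade_counts[:-recent_months]
    recent = monthly_trade_counts[-recent_months:]
    if not history or not recent:
        return 0.0
    baseline = sum(history) / len(history)
    recent_avg = sum(recent) / len(recent)
    if baseline == 0:
        return float("inf") if recent_avg > 0 else 0.0
    return recent_avg / baseline


def screening_flags(orders, monthly_trade_counts, linked_to_controller,
                    max_cancel_ratio=0.5, max_surge=5.0):
    """Combine the three checks into named flags (thresholds are illustrative)."""
    return {
        "frequent_cancellations": cancel_ratio(orders) > max_cancel_ratio,
        "dormant_turned_active": activity_surge(monthly_trade_counts) > max_surge,
        "linked_to_controller": linked_to_controller,
    }


# Toy account: 3 of 4 orders withdrawn, and a jump from about 1 trade a month to 40.
orders = [{"cancelled": True}, {"cancelled": True},
          {"cancelled": True}, {"cancelled": False}]
monthly_counts = [1, 0, 1, 2, 40]
flags = screening_flags(orders, monthly_counts, linked_to_controller=False)
print(flags)
```

Flags of this kind would only be leads; as the sources stress later in the article, they still require manual investigation.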

Generally speaking, “rat trading” generates high levels of profit, so accounts turning a considerable profit are also often taken as possible leads on improper trading. The way accounts make a profit and settle can be “locked down” by data, allowing investigators to see the original profit-making process. For example, where a major shareholder conducts a large transaction at a low price, and the receiving party holds the shares without selling, and only sells them on the secondary market after a sharp rise in the stock price to make a profit, then the receiving party might be worth following up on. Moreover, where accounts frequently buy a large number of shares before the announcement of a high interest rate by companies, the accounts’ precise buying motives and beneficiaries are all important clues.

A source close to the regulators also revealed that the Shanghai Stock Exchange this year cooperated with the CSRC to carry out a special campaign to fight market manipulation, an effort in which big data again played a large role. Due to the hidden nature of most of these types of illegal behavior, the difficulty and intensity of the data mining involved is much greater than what is needed for breakthroughs in general insider trading cases. However, the accuracy of evidence revealed through complex big data analyses is typically quite high.

According to the case handling process, after the stock exchange submits case leads, if the clues are clear, there will be a case filing and formal investigation; if the clues need to be further verified, there will be a preliminary investigation. It is worth mentioning that the reporting of clues also follows an unusually strict process. According to the aforementioned sources close to the stock exchange, when the market audit department of the CSRC submits documents relating to case clues, it needs to use a confidential machine to send and back up the information.

The Shanghai Stock Exchange’s “2014 Self-Regulatory Work Report” disclosed that since 2013, it has used big data analysis to carry out unified-standard verification of fund companies’ relevant securities transactions and to fight “rat trading” behavior with increasing precision. Shanghai Stock Exchange board director Mr. Pan Xuexian revealed during an interview with Xinhua early this year that after adopting the data “mousetrap” model, the exchange reported more than 20 case clues of “rat trading,” and the monetary amount involved in these cases rose to 10 billion RMB.

[IMAGE 1: See translation and reproduction of original at end]

“Next, the Shanghai Stock Exchange will continue to fully utilize big data mining technology, develop and utilize models of various types of illegal trading, investigate and target valuable clues, and implement a regulatory change from a ‘human determination model’ to a ‘technology oriented model,’” Pan Xuexian said.

According to information collected by Caixin reporters, the 3GSS system is expected to continue to expand its data dimensions, and to integrate more and deeper data provided by the China Securities Depository and Clearing Co., Ltd. (hereafter referred to as CSI) and the China Securities Investor Protection Fund Co., Ltd. (hereafter referred to as the “Insurance Fund”). In addition, drawing on the experience of big data monitoring gained in recent years, the Shanghai Stock Exchange has formed a set of key points and standards for the analysis of the various case clues. This has better provided for the development of data analysis models that can be used to determine account trading habits and style.

[IMAGE 2: See translation and reproduction of original at the end of the article]

Uncover people who “grab the hat”

A person close to the Shenzhen Stock Exchange revealed that for the past two years, the stock exchange has used a data analysis system to report clues to the CSRC, with a case filing rate of close to 100%. The Shenzhen Stock Exchange currently runs a fourth generation monitoring system, which already has intelligent data analysis subsystems based on big data. These include four transaction analysis platform models, namely, a converging transaction analysis platform, an insider trading analysis platform, a market manipulation analysis platform, and a “grab the hat” trading analysis platform.

It is worth mentioning that “grab the hat” trading (referring to instances where securities industry personnel trade or hold securities, and publicly evaluate, predict or offer investment advice regarding those securities and their issuer, in order to profit from anticipated market fluctuations) has always been hard for market regulators to regulate. However, with big data analysis and monitoring, this type of trade may become more difficult to hide.

“Although the previous system could handle massive amounts of transaction and registration settlement data, unstructured text data of all kinds was processed manually, so quickly and accurately locating abnormal transactions was a great challenge,” stated the previously mentioned source close to the Shenzhen Stock Exchange. Further, the source noted, many “grab the hat” perpetrators are veterans of the securities industry with a strong ability to counter investigations, so that “in addition to using trading and settlement data, listed companies’ announcements, online media news, brokerage research reports, and other sources of information must be comprehensively analyzed to find clues.”


Where there are a large number of accounts, the transaction characteristics are complex, the analyst information is incomplete, or other people’s accounts are used for operations, identifying “grab the hat” trading is even more difficult.

This kind of work, which relies heavily on the analysis of text information, cannot be carried out by the current regulatory system of the Shenzhen Stock Exchange. A technical professional familiar with the field said, “The old version of the regulatory system was still analyzing only the internal data of the stock exchange; external data was not collected. Regulatory personnel had to expend a lot of effort switching back and forth between the intranet and the internet, and in essence the work was mostly done through manual visual identification.”

According to information gathered by Caixin reporters, in 2010, the Shenzhen Stock Exchange began researching regulatory strategies focusing on this type of information manipulation. To this end, in fact, the exchange deployed a special team to the United States to learn from practices there, primarily studying the SONAR system of the Financial Industry Regulatory Authority (FINRA). FINRA is an independent regulatory body of Wall Street, whose members are overwhelmingly lawyers and traders. It is known as the behind-the-scenes helper for the United States Securities and Exchange Commission. The SONAR system combines transaction monitoring and news analysis, and uses big data algorithms to quickly identify insider trading behavior.

However, this system cannot be copied directly. The reason is that the United States’ financial data is standardized, and the English language is also more conducive to machine reading. These two conditions are not available in China.

“There are also some differences between Chinese and English in terms of setting up the necessary software systems, and in attempting to implement certain findings we encountered this bottleneck,” Song Liping, general manager of the Shenzhen Stock Exchange, has said publicly, also noting the difficulty of analysis: “Information is very dispersed, so collecting it for data mining and text mining is actually still a little difficult in practice.”

In the Chinese language system, the smallest unit of information is the Chinese character, and different arrangements and combinations of characters have different meanings. A basic task is therefore “segmentation”: establishing a set of algorithms that segment the text information into a machine-readable form, in order to better analyze stock price volatility and correlations.
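To make the idea of “segmentation” concrete, the sketch below uses forward maximum matching, a classic dictionary-based technique. The article does not say which algorithm the exchange adopted, so this is only an illustration; the function name, the tiny vocabulary, and the sample sentence (roughly, “the exchange monitors abnormal trading”) are invented for the example.

```python
def forward_max_match(text, vocabulary, max_word_len=4):
    """Split `text` into words by greedily taking the longest dictionary
    match at each position; unknown characters become one-character tokens."""
    tokens = []
    i = 0
    while i < len(text):
        longest = min(max_word_len, len(text) - i)
        for size in range(longest, 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or candidate in vocabulary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Illustrative vocabulary: "exchange", "monitor", "abnormal", "trade".
vocabulary = {"交易所", "监控", "异常", "交易"}
tokens = forward_max_match("交易所监控异常交易", vocabulary)
print(tokens)
```

Because the longest match wins, the opening characters are read as the single word for “exchange” (交易所) rather than as “trade” (交易) followed by a stray character, which is exactly the kind of ambiguity that does not arise in space-delimited English text.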


To break through the bottleneck, the technology effort was led by the Shenzhen Stock Exchange’s information management department (also known as the financial innovation lab). Following a collaborative effort between multiple departments, they developed a data analysis technique based on Chinese text mining, and this technique is currently applied to “grab the hat” trading, network information monitoring, text information collection, intelligent classification, and many other areas.

“We not only learned from international mainstream alarm methods, but also used indicators in line with the characteristics of China’s securities market, matching the transaction structure and trading behavior of investors, to effectively eliminate interference signals; as a result, the accuracy of early warnings is relatively high.” The aforementioned technical source said the technology can also carry out similarity, concentration, and correlation analysis.

At present, the surveillance system of the Shenzhen Stock Exchange has a total of 204 alarm indicators and 300 real-time and historical data categories, and can perform more than 60 special investigation and analysis functions. It can handle more than 1 billion transactions per day, has a peak processing capacity of 25,000 transactions per second, can store data for over 20 years, and has a storage capacity of up to 40 terabytes. From the first day an account is opened, every investor’s commission, transaction, trust, and other transaction data falls within the scope of the regulatory system.

The regulatory system also has a robust insider information database, which includes information on each company’s directors, supervisors, executives, major shareholders, intermediaries, and relevant persons reported by the listed company, and a series of analysis modules has been designed around it.

For example, in the monitoring of insider trading, the system can compare the personnel data and other information disclosed by the listed company with the transaction records of other accounts. If trading in the stock is abnormal before the information is released, and this also trips an alarm indicator in the system’s insider trading analysis module, the alarm is raised automatically.
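A simplified version of this check can be sketched as follows. It assumes the insider database maps each stock to a set of listed insiders and flags any trade by such a person within a fixed number of days before an announcement. The function `insider_window_alerts`, the 30-day lookback, and every name in the sample data are hypothetical, and the real module’s indicators are not public.

```python
from datetime import date, timedelta


def insider_window_alerts(trades, insiders_by_stock, announcements,
                          lookback_days=30):
    """Flag trades by listed insiders made shortly before an announcement.

    trades: list of (holder, stock, trade_date) tuples
    insiders_by_stock: dict mapping stock -> set of insider names
    announcements: dict mapping stock -> list of announcement dates
    """
    lookback = timedelta(days=lookback_days)
    alerts = []
    for holder, stock, trade_date in trades:
        if holder not in insiders_by_stock.get(stock, set()):
            continue
        for announced in announcements.get(stock, []):
            # Only trades strictly before the announcement and inside
            # the lookback window raise an alert.
            if timedelta(0) < announced - trade_date <= lookback:
                alerts.append((holder, stock, trade_date, announced))
    return alerts


# Toy data with invented names and dates.
insiders = {"S1": {"director_li"}}
announcements = {"S1": [date(2015, 6, 30)]}
trades = [
    ("director_li", "S1", date(2015, 6, 20)),   # 10 days before: flagged
    ("director_li", "S1", date(2015, 3, 1)),    # outside the window
    ("retail_zhang", "S1", date(2015, 6, 25)),  # not on the insider list
]
alerts = insider_window_alerts(trades, insiders, announcements)
print(alerts)
```

In practice the hard part is the one this sketch assumes away: linking a trading account to a person on the insider list when the account is held in someone else’s name.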

In addition, the Shenzhen Stock Exchange market supervision department also set up a special real-time monitoring team, not only carrying out analysis according to the regulatory system’s real time alarm platform, but also closely monitoring market dynamics, including analysts’ stock reviews, research reports, news and media commentary and information, etc.

According to Caixin reporters’ investigations, in 2014 the Shenzhen Stock Exchange started the construction of a new generation of monitoring systems. The fifth generation of monitoring systems is expected to be launched at the end of 2017, and will comprehensively use regulatory methods including big data and public opinion analysis. It will also increase the monitoring of options and other derivatives, high-frequency trading, program trading, and cross-market and cross-product operation monitoring functions.

Central regulatory information platform

Although the big data monitoring technologies of the Shanghai and Shenzhen Stock Exchanges each have their own merits, they are incompatible with each other, meaning that the data from the two exchanges cannot be fully integrated. In addition, systems are built redundantly, information collection is not standardized, and there is no uniform reporting time, all common problems of the software and hardware produced in this area in recent years.

The CSRC has been interested in creating a large-scale data system that would not only combine the data of the exchanges and insurance funds, but also use the benefits of cloud computing to integrate the daily regulatory data of securities commission agencies, industry associations, and regulatory units.

Thus, the construction of the central regulatory information platform was put on the agenda in early 2014.


According to information gathered by Caixin reporters, the platform is composed of public and basic service function modules and business function modules. The former include a unified data submission system, a central database, an external data exchange system, a public information feedback and dissemination system, etc., while the latter are used to support the CSRC’s administrative licensing, inspection and punishment, daily supervision, macro supervision, internal management, and other work.

In early 2014, Xiao Gang, chairman of the CSRC, spoke at the National Securities and Futures Regulatory Work Conference, stating that “We need to adapt to the needs of regulatory work in the era of big data for information classification, integration, and mining. The construction of this platform is not only a unified integration of the data and resources of the whole system, but also a collection and reorganization of regulatory business and processes.”

At present, most of the platform’s projects are undergoing or have completed open tender. A staff member at Wen Si Hai Hui, a technology company that successfully bid for the most central project, the construction of the central database, told Caixin reporters that because this is a sensitive time period, they were unable to reveal details of the platform publicly.

The business functions of the central regulatory information platform comprise ten sub-projects, including case auditing and investigation, administrative penalties, corporate supervision, regulation of middleman agencies, risk monitoring, statistical analysis, institutional qualification licensing, personnel qualification, product licensing, etc.

Among them, the comprehensive investigation and case management system was the first to be constructed. It includes four subsystems: clue discovery mechanisms, case management, investigation and analysis, and trial review. Candidates for project bidding include Meiya Pico (300188.SZ), a listed company focusing on electronic data forensics, criminal technology products, and network information security.

According to the original construction plan timetable, the aim was to finish building the central regulatory information platform within three years, that is, before the end of 2016. However, according to Caixin, there were some problems during the process of building the platform, and some projects have not yet completed initial proceedings, meaning that it is extremely unlikely the platform will be completed on time.

“Some units and departments do not pay enough attention or put forth enough effort, and have not proposed their business needs on schedule, or if they have, the needs were not fully debated and substantiated, and this has affected the progress and quality of the platform construction,” Xiao Gang pointed out at the 2015 National Securities and Futures Supervision Work Conference. “Some departments and units are self-centered and are used to working with their own systems, creating all kinds of excuses to delay system and data integration. This has resulted in barriers to completing the foundational work of establishing the system.”

After years of business practice, the regulatory system departments have already formed their own business processes and data calculation methods; thus, requiring these various data source units to unify their data reporting mechanism is clearly not a short-term project.

In addition, in setting parameters for the investigation and punishment of offenders, regulation of the securities of listed companies, and the regulation of securities companies and other systems, due to the large number of departments involved and duties unperformed, communication costs are very high.

Under the requirements set early in the year, the first phase of the project concerning the national database platform was to be built within 2015; however, the aforementioned source at Wen Si Hai Hui told Caixin reporters that the project has not yet commenced.

Big data: Not an all-powerful tool

“After all, machines are not people, and merely relying on regulatory indicators is not enough. The results of large-scale data analysis are only a hint and a reference,” the exchange source admitted. “Inspectors also need to look at information from all dimensions, along with historical regulatory information and experience, and generate an analysis of the clues using a combination of data analysis and manual investigation.”

Apart from the Shenzhen Stock Exchange regulatory system, which is independently researched and developed, the Shanghai Stock Exchange and the central regulatory information platform system both have outsourced work to technology companies to aid in their construction, raising concerns about data security.

The previously cited Neusoft Group staff member indicated to Caixin that although the intellectual property of the system belongs to the company as its developer, the system is run by the stock exchange. “It is impossible for us to get the internal data of the stock exchange, and they did not give us any transaction data during the system development and testing stage.”

In addition, with the development of innovative businesses, big data regulatory technology is also facing a lot of challenges.

On the one hand, the regulatory indicators for ETF options, margin financing, collateralized repurchases, and other businesses are relatively complex, while “market value management” and other new types of manipulation are more subtle.

“From the data collected from individual transactions, this kind of account is not controlled within the system, and short-term manipulation does not apply; from trading volume statistics, there do exist situations where the transaction amount is large on individual trading days, but looking at transactions over consecutive trading days, the daily average data does not fit the stock-retaining manipulation features,” the abovementioned source continued. “A lot of illegal behavior is difficult to analyze and to identify. The more we regulate, the more they try to bypass us. We do not rule out the possibility that some of the more ingenious offenders, with more sophisticated trading techniques, will be able to avoid big data regulation.”

On the other hand, identifying the actual controller of an account is still a problem, even with big data analysis. Regulators are still often unable to pierce the veil that obscures the true holders or users of an account, especially when it comes to identifying the actual users of accounts on the QFII, Shanghai and Hong Kong exchange. Similarly, the actual beneficiaries of private equity products, or controllers that used HOMS and other systems, are difficult to draw out. “This is a cat-and-mouse game. The enemy is in the dark and I am in the light; the difficulty of regulating is self-evident,” the source stated.

The individual familiar with market regulatory technology further stated, “The central regulatory information platform is just the integration of data within the securities regulatory system; if the household registration data of public security, bank system billing data, and even the relationship data between mobile communication and social media and so on can be shared across departments and regions, then more clues will surface.”

At the same time, he believes that the higher value of big data lies not in discovering illegal trading, but in letting the computer learn and make predictions. “We are in the primary stage, and the system is responding to the given rules. Work is under way on choosing the best solution from the rules and on accurately processing vague information; if a high-value computer can learn how to determine new rules in different scenarios, and can make predictions before illegal trading occurs, then that would be a truly magical computer.”

However, members of the legal profession have said that big data technology, no matter how strong, is only a temporary measure. The penalties for “rat trading” are quite light, and taking into account the “low-risk, high-yield” mentality of many market players, the deterrent effect of big data is limited. In order to really cure the problem, more drastic measures should be taken to curb “rat trading” and insider trading.


Image 1

Abnormal trading clues rising to prominence

Sources of clues for the China securities commission’s law enforcement efforts in the first half of 2015

  • 253 abnormal trading clues reported by the Shanghai and Shenzhen stock exchanges (unit: clues)
  • 54 clues discovered through regular regulatory efforts carried out by the commission
  • 42 clues discovered through the national shares transfer system, futures trading exchange, investigation and regulatory measures, complaints, reporting, commission association, fund associations, and other avenues
  • 20 other clues discovered through referrals from other departments

Insider trading cases have seen a sharp increase

Nature of cases investigated by China’s securities commission in recent years

  • Illegal information disclosure
  • Insider trading
  • Market manipulation
  • Other cases

Source: China Securities Regulatory Commission

Image 2

Diagram of the central regulatory information platform

[Left-hand black boxes, from top to bottom]

On-site inspection

On-site inspection and case handling

(one-way link by VPN)

Securities companies

Futures companies

Fund companies

Accounting firms

Law firms

Other middlemen agencies

(two-way link by Securities network)

Listed companies

Non-listed companies

(two-way link by Internet)

Stock exchanges

China Securities Depository and Clearing


(two-way link by Securities network)

News media

Public opinion

(one-way link by Internet)

[Top red boxes, left to right]

Internal departments (linked by intranet)

Established agencies (linked by intranet)

Self-regulatory organizations (linked by Internet)

Management companies (linked by Internet)

[==Large box==]

[Top inner box, black headings and gray functions, left to right]

Business function group

Administrative licensing

Company qualification licensing

Personnel qualification licensing

Product licensing

Investigation and penalizing

Investigation and case handling

Administrative penalties

Daily data analysis

Regulatory measures for middleman agencies

Regulatory measures for companies

Macro regulation

Risk assessment and prediction

Statistical analysis

Internal management

[Red footer] Law and policies information management

[Red side-boxes]

[Left] On-site data collection system (one-way link)

[Right] Public information disclosure feedback system (two-way link)

[Bottom inner box, listed top to bottom]

Public and infrastructure function group

Basic institution and personnel information

Administrative licensing data

Petition and complaint data

Agency operation data

Administrative penalty data

Internet public opinion data

Laws and regulations database

Company disclosure data

Pending cases data

Department exchange data

Statistical data

Transaction settlement data

Other regulatory data (regulating measures, authoritative information, public promises etc.)

[Bottom line] Central database

[Red side-boxes]

[Left] Off-site systems data collection system (two-way link)

[Right] External data exchange system (two-way link)

[Right-hand links]

Internet (two-way links)

Specialized network (two-way link to) other departments and regulatory agencies

Source: compiled by Caixin reporters using public information.