CROSS-REFERENCE TO RELATED APPLICATIONS
The present application is a continuation application of and claims priority to U.S. application Ser. No. 17/304,099, filed Jun. 14, 2021, which is a continuation of U.S. application Ser. No. 16/507,633, filed Jul. 10, 2019, now U.S. Pat. No. 11,100,584, which is a continuation application of U.S. application Ser. No. 16/146,643, filed Sep. 28, 2018, now U.S. Pat. No. 10,373,253, which is a continuation application of U.S. application Ser. No. 14/857,219, filed Sep. 17, 2015, now U.S. Pat. No. 10,089,687, which claims the benefit of the earlier filing date of U.S. provisional application 62/200,989 having common inventorship with the present application and filed in the U.S. Patent and Trademark Office on Aug. 4, 2015. The benefit of priority is claimed to each of the foregoing, and the entire contents of each of the foregoing are incorporated herein by reference.
BACKGROUND
The Securities and Exchange Commission (SEC) has adopted Rule 613 under the National Market System (NMS). Rule 613 requires national securities exchanges and national securities associations, such as self-regulatory organizations (SROs), to implement and maintain a consolidated audit trail (CAT). This audit trail improves the ability of the SEC and the SROs to oversee trading in U.S. securities markets. Current reporting systems give broker-dealers only limited means of reviewing their submitted data, which complicates issue resolution and limits the ability of the reporters to mine the data for business intelligence.
SUMMARY
In some embodiments, a network-based order linkage system is provided for maintaining a consolidated audit trail (CAT) of trading events in securities markets. The order linkage system includes a device configured to receive event data for one or more orders based on one or more order characteristics, determine linkages between the one or more orders based on parent relationships of the one or more orders, verify the linkages between the one or more orders based on the event data, and determine order lifecycles based on the linkages between the one or more orders.
In another exemplary embodiment, an associated method includes receiving, via a network, event data for one or more orders based on one or more order characteristics; determining linkages between the one or more orders based on parent relationships of the one or more orders; verifying the linkages between the one or more orders based on the event data; and determining order lifecycles based on the linkages between the one or more orders.
The foregoing general description of the illustrative embodiments and the following detailed description thereof are merely exemplary aspects of the teachings of this disclosure, and are not restrictive.
BRIEF DESCRIPTION OF THE DRAWINGS
A more complete appreciation of this disclosure and many of the attendant advantages thereof will be readily obtained as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings, wherein:
FIG. 1 is a block diagram of an exemplary network topology of an order linkage system;
FIG. 2 is a functional block diagram of an order linkage system;
FIG. 3A is a flowchart of an order identification process;
FIG. 3B is a diagram of an order lifecycle linkage;
FIG. 3C is a diagram of an order lifecycle linkage;
FIG. 4 is a diagram of an exemplary order lifecycle;
FIG. 5 is a flowchart of an exemplary order linkage process;
FIGS. 6A-6D are diagrams of exemplary order lifecycle matrices;
FIG. 7 is a flowchart of a lifecycle generation process;
FIGS. 8A-8C are exemplary diagrams of recursions of the lifecycle generation process;
FIG. 9 illustrates a block diagram of an exemplary computing device.
In the drawings, like reference numerals designate identical or corresponding parts throughout the several views.
DETAILED DESCRIPTION
Devices, systems, and methods for creating order lifecycles via daisy chain linkages are configured to determine relationships between executed orders over a predetermined period of time. Orders placed at consolidated audit trail (CAT) reporters such as broker-dealers, exchanges, and the like, are tracked as the orders are routed, combined with other orders, or filled. The daisy chain linkages between the orders can be determined independently of a sequence in which the orders are received, and the relationships between the orders are based on determining parent-child relationships between the one or more orders. Throughout the disclosure, the daisy chain linkages are interchangeably referred to as “linkages.”
FIG. 1 is a block diagram of an exemplary order linkage system 100, including one or more processing systems (e.g., processors, computing devices, servers, virtual machines, parallel processing threads, etc.), such as a consolidated audit trail (CAT) processor 106 with processing circuitry that is configured to receive event data for one or more orders from one or more CAT reporters 114 via a network 104 in order to determine relationships between the orders over a predetermined period of time. The CAT processor 106 detects errors in event reporting based on the determined relationships between the orders and outputs error reports to the CAT reporters 114. The CAT processor 106 can include one or more servers 126, databases, and/or other computer hardware associated with determining, processing, and storing the event data and the order lifecycles for the one or more orders. In some implementations, the processes of the CAT processor 106 can be implemented in a cloud computing environment 102 in order to provide increased scalability of an amount of data processed by the CAT processor 106.
The cloud computing environment 102 may include one or more resource providers, such as the servers 126, the CAT processor 106, and the like. Each resource provider may include computing resources. In some implementations, computing resources may include any hardware and/or software used to process data. For example, computing resources may include hardware and/or software capable of executing algorithms, computer programs, and/or computer applications. In some implementations, exemplary computing resources may include application servers and/or databases with storage and retrieval capabilities. Each resource provider may be connected to any other resource provider in the cloud computing environment 102. In some implementations, the resource providers may be connected over the network 104. Each resource provider may be connected to the computing devices 116 over the network 104.
The cloud computing environment 102 may include a resource manager. The resource manager may be connected to the resource providers and the computing devices 116 over the network 104. In some implementations, the resource manager may facilitate the provision of computing resources by one or more resource providers to the computing devices 116 associated with the CAT reporters 114, data reporting entities, and/or regulatory agencies. The resource manager may receive a request for a computing resource from a particular computing device 116. The resource manager may identify one or more resource providers capable of providing the computing resource requested by the computing device 116. The resource manager may select a resource provider to provide the computing resource. The resource manager may facilitate a connection between the resource provider and a particular computing device 116. In some implementations, the resource manager may establish a connection between a particular resource provider and a particular computing device 116. In some implementations, the resource manager may redirect a particular computing device 116 to a particular resource provider with the requested computing resource.
In one implementation, the cloud computing environment 102 may include GOOGLE Cloud Platform™. The processes associated with determining linkages between the one or more orders can be executed on a computation processor, such as the GOOGLE Compute Engine. The CAT processor 106 can also include an application processor, such as the GOOGLE App Engine, that can be used as the interface with the CAT reporters 114 to receive the event data and output the error reports. The CAT processor 106 also includes one or more databases, such as cloud storage and a query database. In some implementations, the cloud storage database, such as the GOOGLE Cloud Storage, stores processed and unprocessed event data. The query database, such as the GOOGLE BigQuery platform, can store processed data for a predetermined period of time that is accessible via queries from regulatory staff associated with self-regulatory organizations (SROs) 118 as well as the Securities and Exchange Commission (SEC).
The CAT reporters 114 can include broker-dealers, the Financial Industry Regulatory Authority (FINRA), and other entities associated with order routing and/or linkage. The CAT reporters 114 report instances of orders being placed, routed, or filled to the CAT processor 106, and the reports include the event data associated with each order, such as symbol, number of shares, time of fill, resulting action (e.g., routed, filled), and the like. In some implementations, the CAT reporters 114 can be configured to automatically output the event data for the orders to the CAT processor 106 when the orders are executed. The CAT reporters 114 can also be configured to manually output the event data to the CAT processor 106. The CAT reporters 114 each have a unique identification code that can be used to identify the orders being linked by the CAT processor 106. The CAT reporters 114 can also access the error reports output by the CAT processor 106 via an interface at one or more computing devices that can include a computer 116, mobile device 116b, and one or more intermediate computing systems 116c. In some examples, broker-dealers can access the error reports via a web interface, handheld device application, or telephone interface.
The order linkage system 100 also includes one or more other data reporting entities that provide data to the CAT processor 106 that is used to link orders as well as validate the event data and order linkages. The CAT processor 106 also outputs updated data to the data reporting entities upon receiving error corrections from the CAT reporters 114. For example, the SROs/exchanges 118 report market data to the CAT processor 106 and can also query order linkage information from the query database. In addition, the securities information processor (SIP) 112 provides equity, options, and market data to the CAT processor 106. The CAT processor 106 also monitors the options clearing corporation (OCC) 110 to maintain current and historical symbologies for equities. Data provided by the OCC 110 may include exercise, assignment, and clearing member trade agreement (CMTA) information. Trade reporting facilities (TRFs) and alternative display facilities (ADFs) provide trade data 108 to the CAT processor 106.
The network 104 represents one or more networks, such as the Internet, connecting the cloud environment 102, the CAT processor 106, the CAT reporters 114, and the other data reporting entities, such as the exchanges/SROs 118, the SIP 112, the OCC 110, and the TRF/ADF 108. The network 104 can also communicate via wireless networks such as WI-FI, BLUETOOTH, cellular networks including EDGE, 3G and 4G wireless cellular systems, or any other wireless form of communication that is known. In one implementation, the network 104 is agnostic to local interfaces and networks associated with the CAT reporters 114, the cloud environment 102, and the other data reporting entities to allow for integration of the local interfaces and networks configured to perform the processes described herein.
FIG. 2 is a functional block diagram 200 of the order linkage system 100. In some implementations, the processes executed by the CAT processor 106 are associated with a data repository layer 202 or an application layer 204. In some implementations, the processes of the data repository layer 202 include processing reference data 206, data interface and communication 208, error handling 210, and event processing 212. For example, processing the reference data 206 can include receiving reference data associated with order lifecycles being generated from the data reporting entities, such as the exchanges/SROs 118, the SIP 112, the OCC 110, and the TRF/ADF 108. The CAT processor 106 processes the reference data to determine properties and characteristics of executed orders that can be used to link and validate the orders reported by the CAT reporters 114.
The data interface and communication function 208 of the data repository layer 202 can include communication of the CAT processor 106 with the CAT reporters 114 and the other data reporting entities of the order linkage system 100 to receive the event data and/or reference data and also output error reports and other reports associated with the order linkages. The error handling 210 includes detecting errors in the order linkages based on linkage verification data that is produced when the order linkages are determined. In addition, the event processing 212 includes processing the event data for the one or more orders received from the CAT reporters 114 to determine the relationships between the orders.
In addition, the processes performed at the application layer 204 can include presentation services 214, data distribution 216, reporting presentation 218, and other support 220. The processes performed at the application layer 204 are associated with how the data is presented to and received from users at the CAT reporters 114 as well as regulatory staff at the SROs 118 and the SEC. For example, through the presentation services function 214, the CAT processor 106 manages the interface through which the CAT reporters 114 communicate with the CAT processor 106 to provide the event data, receive error reports associated with the order lifecycles, and provide responses to the error reports. The reporting presentation function 218 is associated with providing CAT reports to the SROs 118 and the SEC based on queries received from the regulatory agencies. The data distribution function 216 is associated with providing updated trade and market data to data reporting entities when error corrections from the CAT reporters 114 are received by the CAT processor 106. The support function 220 is associated with providing technical support to the CAT reporters 114, data reporting entities, and regulatory agencies that communicate with the CAT processor 106.
FIG. 3A is a flowchart of an order identification process 300. According to some implementations, when the CAT processor 106 receives event data for an order, a unique order identifier (Order ID) is assigned to the order that is used to link parent orders to child orders when an order lifecycle is developed. As will be discussed further herein, the Order ID associated with each order is used as a row key for an order lifecycle matrix that delineates the event data, parent orders, child orders, and verification data for each order of the order lifecycle.
At step S302, the CAT processor 106 receives a notification from a task queue indicating that one or more orders are available for processing. The CAT processor 106 receives input records 318 for the orders, and parses each file of the input records 318 to extract the event data for the orders, such as the type of order, time of order, number of shares, symbol, customer, CAT reporter, and the like. The processing circuitry of the CAT processor 106 performs initial validation of the event data and identifies valid and invalid orders. For example, invalid orders 320 can include orders that are missing one or more predetermined components of event data. For valid orders, the CAT processor 106 transforms the event data into a predefined standard format. For example, the standard format can include CAT Reporter/Order Event/Receiving Firm ID/Originator ID/Order ID/Price/Shares/Quantity.
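By way of non-limiting illustration, the initial validation and transformation of step S302 might be sketched as follows. The field names, the choice of which fields are treated as required, and the function name `validate_and_normalize` are assumptions made for this sketch only and are not specified by the disclosure.

```python
# Field order follows the exemplary standard format named in the text:
# CAT Reporter/Order Event/Receiving Firm ID/Originator ID/Order ID/Price/Shares/Quantity.
# The snake_case names are hypothetical.
STANDARD_FIELDS = (
    "cat_reporter", "order_event", "receiving_firm_id", "originator_id",
    "order_id", "price", "shares", "quantity",
)

# Assumption: which components are "predetermined" (required) is not stated
# in the disclosure; this subset is chosen only for illustration.
REQUIRED_FIELDS = ("cat_reporter", "order_event", "order_id", "shares")


def validate_and_normalize(records):
    """Split parsed input records into normalized valid orders and invalid orders."""
    valid, invalid = [], []
    for record in records:
        missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
        if missing:
            # Keep the record together with the names of the missing components.
            invalid.append((record, missing))
        else:
            # Project onto the standard field order; absent optional fields become None.
            valid.append({f: record.get(f) for f in STANDARD_FIELDS})
    return valid, invalid
```

In this sketch, fields outside the standard format are dropped during normalization, and the list of missing components could be used when an error report is prepared.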
At step S304, the processing circuitry of the CAT processor 106 partitions the event data into classifications, also referred to as buckets, based on a key associated with an originator identification (ID). In some implementations, the originator is a participant that places an order. The partitioning of the event data can be performed via one or more MapReduce processes. If the order associated with the event data is a new order that has not been received by the CAT processor 106, the partitioning produces a null value and the order is classified as a new trade order 324. If the CAT processor 106 identifies an originator ID that is associated with an existing order, the processing circuitry classifies the order according to the prior participant ID 326. In some implementations, when the originator ID is identified, a current order being processed may be a child of one or more previously received parent orders.
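As a simplified, single-process sketch of the partitioning of step S304, the event data can be grouped into buckets keyed by originator ID, with a null key collecting the new trade orders. The record layout and the function name `partition_by_originator` are hypothetical; an actual implementation may distribute the same grouping across MapReduce workers.

```python
from collections import defaultdict


def partition_by_originator(events):
    """Group events into buckets keyed by originator ID.

    Events without an originator ID land in the bucket keyed by None,
    which corresponds to new trade orders in this sketch.
    """
    buckets = defaultdict(list)
    for event in events:
        buckets[event.get("originator_id")].append(event)
    return dict(buckets)
```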
At step S308, the CAT processor 106 creates and assigns a CAT Order ID to the new trade orders 324. In some implementations, the CAT Order ID includes an identification code associated with the CAT reporter reporting the order. At step S306, when a previously created originator ID is identified, the CAT processor 106 searches one or more data structures, such as Bloom filters 328, based on originator ID and originator order ID to determine whether a parent order has had a CAT Order ID assigned. If the parent order has not had a CAT Order ID assigned, then the processing circuitry of the CAT processor 106 adds the order to a retry file 322 and updates an order count.
At step S310, for each of the found files, the processing circuitry of the CAT processor 106 retrieves the CAT Order ID for the parent order and assigns the same CAT Order ID to the child order so that the parent and child orders are linked via the CAT Order ID. At step S312, the CAT Order IDs assigned to the new trade orders 324 at step S308 and the CAT Order IDs retrieved for the prior participants at step S310 are stored in at least one database. In some implementations, the database can include the query database 330, such as the GOOGLE BigQuery platform, that can store processed data for a predetermined period of time that is accessible via queries from regulatory staff associated with self-regulatory organizations (SROs) 118 as well as the Securities and Exchange Commission (SEC).
At step S314, if the CAT Order ID for the parent order is found, then the current order is added to a found file, and the Bloom filters 328 are updated to reflect the order as a child order of the identified parent order. In addition, for the new trade orders 324, the CAT processor 106 creates corresponding Bloom filters 328 based on the originator ID and the originator order ID. At step S316, the CAT processor 106 updates the cloud storage database with the updated Bloom filters 328. In some implementations, the cloud storage database, such as the GOOGLE Cloud Storage, stores processed and unprocessed event data.
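The identifier assignment of steps S306 through S316 can be approximated by the following sketch. The `BloomFilter` class, the in-memory dictionary standing in for the database, the key layout, and the CAT Order ID format are all assumptions made for illustration. Because a Bloom filter can return false positives, the sketch confirms each hit against the dictionary before reusing an identifier.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter; the sizes are illustrative, not tuned."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def assign_cat_order_ids(orders, bloom, id_store):
    """Assign CAT Order IDs in place; return orders whose parent is not yet known."""
    retry = []
    for order in orders:
        own_key = f'{order["reporter"]}|{order["order_id"]}'
        if order.get("originator_id") is None:
            # New trade order: create an ID that embeds the reporter code.
            cat_id = f'{order["reporter"]}-{len(id_store) + 1}'
        else:
            parent_key = f'{order["originator_id"]}|{order["originator_order_id"]}'
            found = bloom.might_contain(parent_key) and parent_key in id_store
            if not found:
                retry.append(order)  # parent not yet assigned: retry later
                continue
            cat_id = id_store[parent_key]  # child inherits the parent's ID
        id_store[own_key] = cat_id
        bloom.add(own_key)
    return retry
```

A child order that arrives before its parent is returned in the retry list and can be resubmitted on a later pass, at which point it receives the same CAT Order ID as its parent.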
FIGS. 3B and 3C are exemplary diagrams of an order lifecycle linkage. The order linkage includes one or more entities, such as the CAT reporters 114, that receive, route, and/or execute orders. For example, an order from Client A 332, which has an Order ID of CA1234, is electronically routed to Firm A 334. Firm A 334 receives the order from Client A 332 and creates a new order in an order management system (OMS) with a unique OMS-generated ID, which is represented by the System Order ID. In some implementations, a Parent Order ID is also generated to allow for child orders to be linked to parent orders. The System Order ID is then mapped to the Received Order ID from Firm A 334. Firm A 334 then generates two separate child/routed orders for the received order and routes a first child order to Exchange A 336 and a second child order to Firm B 342.
Exchange A 336 receives the first child order from Firm A 334 and generates a new order with a unique Order ID, which is mapped to the received order from Firm A 334. Exchange A 336 then publishes the order/quote message for the first child order received from Firm A 334 to the SIP 338. The first child order from Firm A 334 is executed at Exchange A 336, and the trade is published to the SIP 338. Firm B 342 receives the second child order from Firm A 334 and generates a new order with a unique Order ID, which is mapped to the Received Order ID for the second child order from Firm A 334. Firm B 342 then executes the second child order from Firm A 334 internally and publishes execution/trade details to FINRA TRF 340. The FINRA TRF 340 then passes the Execution ID for the second child order to the SIP 338. In addition, the Execution ID is also mapped to the corresponding Received Order ID.
As CAT events are submitted to the CAT processor 106 from the CAT reporters 114, order linkages are created based on information associated with the events, and a unique CAT Order ID is assigned to related events. For example, the order linkages can be based on at least one of the CAT reporter, Received Order ID, System Order ID, Routed Order ID, Destination ID, Sending Firm ID/method, and the like.
FIG. 4 is a diagram of an exemplary order lifecycle 400. Order lifecycles delineate relationships between orders that include parent-child relationships. For example, a parent order is an order that is routed to another venue, split into two or more orders, or combined with another order. A child order is an order that is produced from one or more parent orders. For example, a child order can result from the routing, splitting, or combining of one or more parent orders. Each circle of the order lifecycle 400 represents a new order, a routed order, or a filled order. In some implementations, a new order is an order having no corresponding parent orders. One example of a new order is when a customer places an order on the stock market to buy a number of shares of a stock at market price.
The order lifecycle 400 commences with new orders A and B, which have no parent orders. For each order, the CAT reporter 114 can fill (or execute) the order to complete the transaction, break apart and/or combine the order with one or more other orders, or route the order to another CAT reporter 114. Routed orders, such as orders C, D, E, F, H, and I in the order lifecycle 400, have both corresponding parent and child orders. Filled orders, such as orders G and J in the order lifecycle 400, have no corresponding child orders. Each order received by the CAT processor 106 includes corresponding event data that can include symbol, price, time, type of order, customer, CAT reporter, and the like. When the orders are received at the CAT processor 106, the corresponding parent orders are known, but the child orders may not be known. In addition, the orders may have more than one parent order and more than one child order. As will be discussed further herein, the processing circuitry of the CAT processor 106 determines the child orders as orders are received during an order linkage process.
The new order A is split into two orders, with one split being routed as order C, and the other split combined with the new order B to become order D. The order C is further routed as order E, and the order D is further routed as order F. The order E is then split into two orders, with one split corresponding to order G, which is a filled order, and the other split being combined with a split from order F to form order H. Another split of order F is routed as order I. In some implementations, orders H and I are routed to the same CAT reporter and are combined into order J, which is filled. According to some embodiments, the orders D and H may also be classified as new orders because the orders D and H are formed from a pooling of shares from other orders and are aggregates of multiple events.
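For illustration only, the parent relationships of the order lifecycle 400 described above can be written as a mapping from each order to its parent orders, from which the child orders and the new/routed/filled classification follow. The names `PARENTS`, `children_of`, and `classify` are introduced for this sketch and do not appear in the disclosure.

```python
# Parent orders of each order in the exemplary lifecycle, as described in the text.
PARENTS = {
    "A": [], "B": [],
    "C": ["A"], "D": ["A", "B"],
    "E": ["C"], "F": ["D"],
    "G": ["E"], "H": ["E", "F"],
    "I": ["F"], "J": ["H", "I"],
}


def children_of(parents):
    """Invert the parent mapping to obtain each order's child orders."""
    children = {order: [] for order in parents}
    for order, order_parents in parents.items():
        for parent in order_parents:
            children[parent].append(order)
    return children


def classify(parents):
    """Label each order as new (no parents), filled (no children), or routed (both)."""
    children = children_of(parents)
    labels = {}
    for order in parents:
        if not parents[order]:
            labels[order] = "new"
        elif not children[order]:
            labels[order] = "filled"
        else:
            labels[order] = "routed"
    return labels
```

Under this mapping, orders A and B are classified as new, orders G and J as filled, and the remaining orders as routed, consistent with the description of the order lifecycle 400.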
FIG. 5 is a flowchart of an order linkage process 500. According to certain embodiments, the CAT processor 106 performs multiple iterations of the order linkage process 500 to process events within a predetermined period of time. For example, in one implementation, the CAT processor 106 processes 100 billion events within four hours to determine an order lifecycle for the events and determine whether errors exist in the event data and/or the order lifecycle. In addition, the predetermined period of time for processing the events is based on providing the CAT reporters 114 an opportunity to correct any errors identified during the order linkage process 500. In the present disclosure, the order linkage process 500 is discussed in reference to the order lifecycle 400. The CAT processor 106 can commence iterations of the order linkage process 500 as soon as event data for the orders is received by the CAT processor 106 and before the event data for all of the orders has been received. The linkages are based on the parent-child relationships between the orders, and the order linkage process 500 is scalable to accommodate any number of orders and linkages having any number of parent and/or child orders.
In some implementations, the CAT processor 106 can determine an order lifecycle matrix representing the relationships between the orders independent of the sequence in which the orders are received. For example, the CAT processor 106 may receive the orders from the order lifecycle 400 in the following sequence: H, E, G, A, J, I, D, C, B, F. The processing circuitry can determine the relationships between the orders based on the parent relationships, which are independent of the sequence in which the orders are received. In some implementations, the event data and associated linkage data for the orders are written to the lifecycle matrix in an HBase format. In some implementations, the event data and linkage data are manipulated in a scalable data storage table that supports high read and write throughput at low latency.
At step S502, the CAT processor 106 receives event data for a next event of an order lifecycle. Throughout the disclosure, events can be interchangeably referred to as orders. For example, the CAT processor 106 may first receive the event data for order H from the order lifecycle 400. The CAT processor 106 receives the event data for subsequent orders with subsequent iterations of the order linkage process 500. In some implementations, the event data for each received order includes the type of order, time of order, number of shares, symbol, customer, CAT reporter, and the like. The event data can also identify corresponding parent orders. According to certain embodiments, incoming events are parsed into a predetermined CAT processor format.
At step S504, the processing circuitry of the CAT processor 106 writes the event data to a lifecycle matrix associated with the order lifecycle 400. FIG. 6A includes an exemplary diagram of a lifecycle matrix 600 when order H is received according to sequence 604. The order can be identified by a row key that is a unique identification code associated with the order. In some implementations, the row key includes at least one of a CAT reporter ID and the Order ID. The lifecycle matrix 600 also includes one or more column families to which the event data, linkage data, and/or verification data is written. For example, the event data received at step S502 is written to the “data” column family of the lifecycle matrix 600.
In certain embodiments, the linkage data includes information pertaining to the parent-child relationships of the orders, and the verification data includes information that is used to validate linkages of the order lifecycle 400. For example, the verification data column family can include new order, fill, and exists columns that indicate one or more properties of the order. For example, when the order H is received by the CAT processor 106, the processing circuitry fills the “exists” column of the verification data for order H. Throughout the disclosure, filling a column of the lifecycle matrix 600 refers to assigning a value of one to the entry. If the entry of the lifecycle matrix 600 is not filled, then the entry for the column family is null.
Referring back to FIG. 5, at step S506, the CAT processor 106 identifies one or more parent orders of a current order. The linkage data for the events of the order lifecycle 400 can be represented in the lifecycle matrix 600 by a parent column family and a children column family. The parent and children column families include columns corresponding to each of the orders of the order lifecycle. The processing circuitry of the CAT processor 106 can identify the parent orders for the event based on the event data. For example, the CAT processor 106 can determine whether a parent order and the current order are associated with a common CAT reporter 114, whether the parent order and the current order have a common beneficial customer, and/or whether the parent order is associated with any of the CAT reporters 114 or a non-reporting customer. The order H has parent orders E and F, so the E and F columns of the parent column family are filled for order H. As shown in FIG. 6A, the lifecycle matrix 600 can be updated to reflect the parent orders E and F, as shown by the lifecycle matrix 602. The CAT processor 106 adds the row keys for orders E and F, and updates the children column family for orders E and F to reflect that order H is a child order. Since the orders E and F have not been received by the CAT processor 106, the event data, parent data, and verification data column families are null for the orders E and F.
At step S508, the processing circuitry of the CAT processor 106 determines whether all of the orders for the order lifecycle 400 have been received. If the processing circuitry determines that not all of the orders for the order lifecycle 400 have been received, resulting in a “no” at step S508, then the order linkage process 500 returns to step S502 to receive the event data for the next event. Otherwise, if the processing circuitry determines that all of the orders for the order lifecycle 400 have been received, resulting in a “yes” at step S508, then step S510 is performed. In the example described herein, order F is the last event of the order lifecycle 400 received by the CAT processor 106. Therefore, for each order that is received prior to the order F, the determination at step S508 results in a “no,” and the order linkage process 500 returns to step S502 to receive the next event of the order lifecycle 400.
For example, FIGS. 6B-6D represent the lifecycle matrices for subsequent iterations of the order linkage process 500. FIG. 6B represents a lifecycle matrix 606 when the event data for order E is received after order H. Because the row key for the order E was generated as a parent order for order H, the CAT processor 106 updates the event data column family with the event data for order E as well as the exists column of the verification data column family. The parent column family is also updated to show that order C is a parent order of the order E. As shown in FIG. 6B, the lifecycle matrix 606 can be updated to reflect the parent order C, as shown by the lifecycle matrix 608. The CAT processor 106 adds the row key for order C, and updates the children column family for order C to reflect that order E is a child order. Since the order C has not been received by the CAT processor 106, the event data, parent data, and verification data column families are null for the order C in the lifecycle matrix 608.
FIG. 6C represents a lifecycle matrix 610 when the event data for order G is received after order E. The CAT processor 106 adds a row key for the order G and updates the event data column family with the event data for order G as well as the exists column of the verification data column family. In addition, because the order G is a fill order, the fill column of the verification data column family is updated to reflect the order G as a fill order. The parent column family is also updated to show that order E is a parent order of the order G, and the child column family for the order E is updated to show that the order G is a child order of the order E.
FIG. 6D represents an order lifecycle matrix 612 when a final event is received at the CAT processor 106. For example, for the order lifecycle 400, order F is the final event received by the CAT processor 106. Entries of the lifecycle matrix 612 are independent of the order in which the events are received. The lifecycle matrix includes the event data, linkage data, and verification data for orders A, B, C, D, E, F, G, H, I, and J. For example, since the CAT processor 106 received the event data for all of the events for the order lifecycle 400, the event data column for each event of the lifecycle matrix 612 is filled, and the exists column of the verification data is also filled for each of the events. The linkage data for the parent column family and the child column family is also included for each of the orders of the order lifecycle 400. In addition, the verification data column family indicates that the orders A and B are new orders, and the orders G and J are fill orders.
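The iterations of steps S502 through S508 can be illustrated with the following in-memory sketch, in which a dictionary keyed by row key stands in for the lifecycle matrix and nested entries stand in for the column families. The `EVENTS` table, the column names, and the helper functions are hypothetical, and no particular storage engine is implied.

```python
# Parent orders and event kind for each order of the exemplary lifecycle.
EVENTS = {
    "A": ([], "new"), "B": ([], "new"),
    "C": (["A"], "route"), "D": (["A", "B"], "route"),
    "E": (["C"], "route"), "F": (["D"], "route"),
    "G": (["E"], "fill"), "H": (["E", "F"], "route"),
    "I": (["F"], "route"), "J": (["H", "I"], "fill"),
}


def empty_row():
    """A row whose column families are all null/empty, as for an unreceived order."""
    return {"data": None, "parents": set(), "children": set(), "verification": {}}


def record_event(matrix, order_id, parents, kind):
    """Write one received event: data, verification flags, and both linkage directions."""
    row = matrix.setdefault(order_id, empty_row())
    row["data"] = {"kind": kind}
    row["verification"]["exists"] = 1
    if kind == "new":
        row["verification"]["new_order"] = 1
    if kind == "fill":
        row["verification"]["fill"] = 1
    for parent in parents:
        row["parents"].add(parent)
        # Create the parent's row key if needed and mark this order as its child.
        matrix.setdefault(parent, empty_row())["children"].add(order_id)


def build_matrix(sequence):
    """Process events in the given arrival sequence and return the resulting matrix."""
    matrix = {}
    for order_id in sequence:
        parents, kind = EVENTS[order_id]
        record_event(matrix, order_id, parents, kind)
    return matrix
```

In this sketch, `build_matrix("HEGAJIDCBF")` and `build_matrix("ABCDEFGHIJ")` yield equal matrices, which reflects the statement that the entries are independent of the order in which the events are received. After only order H is processed, the rows for orders E and F exist with order H recorded as a child and all other column families empty.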
Referring back to FIG. 5, at step S510, the processing circuitry of the CAT processor 106 performs a linkage verification of the lifecycle matrix 612 associated with the order lifecycle 400. In some implementations, the CAT processor 106 performs verification of the order linkages as orders are received. The CAT processor 106 can also perform a batch linkage verification for all of the orders of the order lifecycle 400 that have been received prior to an expiration time, and the events of the batch linkage verification can be processed in parallel. Linkages for late orders received after the expiration time can also be processed, and in some cases, the CAT processor 106 outputs a notice to regulators when orders are received after a predetermined time. When the CAT processor 106 detects errors, error reports are output to the CAT reporters 114 and other data reporters, and the CAT processor 106 receives and processes the error corrections that are sent in response to the error reports.
For example, the processing circuitry can detect a linkage error in the lifecycle matrix 612 when the exists column of the verification data column family for one or more of the row keys is null. Linkage errors can also be detected when the new order column of the verification data column family is filled and the corresponding order has at least one parent order. Likewise, linkage errors can be detected when the fill order column of the verification data column family is filled and the corresponding order has at least one child order. In addition, an error is detected for any order that has neither a parent order nor a child order. The processing circuitry of the CAT processor 106 can also detect linkage errors in the lifecycle matrix 612 by comparing one or more elements of the event data for parent and child orders. In other implementations, the lifecycle matrix 612 can include additional verification data columns and/or column families that can be used by the CAT processor 106 to validate linkage relationships between the orders.
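The four linkage checks listed above can be sketched as follows. The row layout, the helper `make_row`, the function `find_linkage_errors`, and the error strings are assumptions made for illustration; the sample orders are hypothetical and are not the orders of the order lifecycle 400.

```python
def make_row(parents=(), children=(), exists=True, new=False, fill=False):
    # Hypothetical row: linkage sets plus the verification data columns.
    return {
        "parents": set(parents),
        "children": set(children),
        "verification": {"exists": exists, "new": new, "fill": fill},
    }

def find_linkage_errors(matrix):
    """Return (order_id, reason) pairs for every rule a row violates."""
    errors = []
    for order_id, row in sorted(matrix.items()):
        flags = row["verification"]
        if not flags["exists"]:
            errors.append((order_id, "event data not received"))
        if flags["new"] and row["parents"]:
            errors.append((order_id, "new order has a parent order"))
        if flags["fill"] and row["children"]:
            errors.append((order_id, "fill order has a child order"))
        if not row["parents"] and not row["children"]:
            errors.append((order_id, "order has no parent or child order"))
    return errors

sample = {
    "P": make_row(children=["Q"], new=True),                  # valid new order
    "Q": make_row(parents=["P"], children=["R"], exists=False),  # never received
    "R": make_row(parents=["Q"], children=["S"], fill=True),  # fill with a child
    "S": make_row(parents=["R"]),                             # valid leaf
    "T": make_row(),                                          # unlinked order
}
errors = find_linkage_errors(sample)
```

Each rule is evaluated independently, so a single row can contribute more than one error to the report.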
Orders can also be rejected by the CAT processor 106 if the event data is missing one or more data elements or is not provided in a predetermined format (e.g., syntax and context checks). For example, an order may be rejected if the event data indicates that the order type is a limit order but the price field of the event data is not populated. Once the errors in the order lifecycle matrix 612 have been identified, the CAT processor 106 can issue error reports to the CAT reporters 114 so that the CAT reporters 114 have an opportunity to correct the errors. According to certain embodiments, the CAT processor 106 communicates the detected errors and order rejections to the CAT reporters 114 via a file transfer protocol (FTP) flat file using predefined rejection codes and events, which can be viewed via email and/or a CAT user interface (UI) at the CAT reporter 114 associated with the error or rejection.
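A syntax and context check of this kind can be sketched as below. The field names, the `REQUIRED_FIELDS` tuple, and the rejection code strings are hypothetical; the disclosure refers to predefined rejection codes but does not enumerate them.

```python
# Hypothetical required data elements for every event.
REQUIRED_FIELDS = ("order_id", "order_type", "timestamp")

def validate_event(event):
    """Return a list of rejection codes; an empty list means the event passes."""
    rejections = []
    # Syntax check: every required data element must be populated.
    for field in REQUIRED_FIELDS:
        if event.get(field) in (None, ""):
            rejections.append("MISSING_" + field.upper())
    # Context check: a limit order must carry a price.
    if event.get("order_type") == "limit" and event.get("price") is None:
        rejections.append("LIMIT_WITHOUT_PRICE")
    return rejections

limit_without_price = {
    "order_id": "E",
    "order_type": "limit",
    "timestamp": "2015-08-04T09:30:00Z",
}
codes = validate_event(limit_without_price)
```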
As the CAT processor 106 performs the linkage verification of the lifecycle matrix 612, statuses are assigned to each of the events associated with a row key of the lifecycle matrix 612. For example, a rejected status indicates that the event failed a format validation and/or an error was detected. A pending status indicates that a rejected event and rejection code have been output to the CAT reporter 114. An accepted status indicates that no errors were detected with the event and that the event was accepted by the CAT processor 106. A corrected status indicates that the CAT reporter 114 issued a correction for a previously submitted event. A deleted status indicates that the CAT reporter 114 has requested that the event be deleted. According to certain embodiments, the CAT reporters 114 have a predetermined response time to respond to the error reports for the order lifecycle 400, and the CAT processor 106 can issue notifications to the CAT reporters 114 when the predetermined response time has expired. In some implementations, the linkage errors detected for the one or more orders may not cause subsequently received events to also be rejected. For example, an order linkage that is broken by an error can be repaired without having to resubmit the subsequent events. Once the errors have been repaired by the CAT reporters 114 and/or the predetermined response time has expired, the CAT processor 106 can output the lifecycle matrix to the cloud storage and/or query database, which can be accessed by the regulatory agencies. The CAT processor 106 can also output the lifecycle matrix 612 directly to the regulatory agencies via email, web user interface, and the like.
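The five event statuses can be sketched as an enumeration with a small assignment function. The precedence among the conditions (deletion first, then correction, then error handling) is an assumption made for this illustration, as are the names `EventStatus` and `assign_status`; the disclosure does not specify how competing conditions are resolved.

```python
from enum import Enum

class EventStatus(Enum):
    REJECTED = "rejected"    # failed format validation and/or an error was detected
    PENDING = "pending"      # rejection and rejection code output to the reporter
    ACCEPTED = "accepted"    # no errors detected
    CORRECTED = "corrected"  # reporter issued a correction for a prior event
    DELETED = "deleted"      # reporter requested deletion of the event

def assign_status(has_errors, reported=False, corrected=False, delete_requested=False):
    """Pick one status for an event; the precedence used here is an assumption."""
    if delete_requested:
        return EventStatus.DELETED
    if corrected:
        return EventStatus.CORRECTED
    if has_errors:
        return EventStatus.PENDING if reported else EventStatus.REJECTED
    return EventStatus.ACCEPTED
```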
In some implementations, the linkages between the orders of the order lifecycle determined at the order linkage process 500 are stored in one or more analytical data stores in the query database of the CAT processor 106 to facilitate queries based on one or more criteria. For example, a CAT Event ID store holds every market event (e.g., orders, cancellations, routes, quotes, etc.) as a row where each event has a unique CAT Event ID, which allows the CAT processor 106 to receive queries for all types of orders while simultaneously allowing targeted queries on individual event types.
FIG. 7 is a flowchart of a lifecycle generation process 700. Once the lifecycle matrix 612 has been constructed via the order linkage process 500, the CAT processor 106 can construct the order lifecycle 400 from the lifecycle matrix 612. In some implementations, the CAT processor 106 can build order lifecycles on demand in response to a query from the CAT reporters 114 and/or the regulatory agencies. Lifecycles and orders can be queried before the order lifecycles are built, but the lifecycle generation process 700 generates the order lifecycles through a parallelized algorithm.
At step S702, the processing circuitry of the CAT processor 106 scans the column families of the lifecycle matrix 612 to identify new orders and/or fill orders. According to certain embodiments, the CAT processor 106 builds the order lifecycle 400 by first locating either the new orders or the fill orders of the order lifecycle by scanning the column families of the lifecycle matrix 612. For example, the processing circuitry may scan the new order column and/or the fill order column of the verification data column family to identify the new orders or the fill orders. The processing circuitry can also scan the parent column family and/or the child column family to identify the new orders or fill orders. For example, when locating new orders, the processing circuitry can scan the parent column family to identify the row keys for orders with no parent orders. Likewise, when locating fill orders, the processing circuitry can scan the children column family to identify the row keys for orders with no child orders. For one implementation described herein, the CAT processor 106 builds the order lifecycle by associating all the events of the order lifecycle with the new orders. Therefore, the processing circuitry identifies orders A and B as the new orders of the order lifecycle 400.
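The scan of the parent and children column families at step S702 can be sketched as two filters over the rows. The dictionary layout and the function names `find_new_orders` and `find_fill_orders` are assumptions, and the sample is a hypothetical single chain rather than the full order lifecycle 400.

```python
def find_new_orders(matrix):
    """Row keys whose parent column family is empty."""
    return sorted(key for key, row in matrix.items() if not row["parents"])

def find_fill_orders(matrix):
    """Row keys whose children column family is empty."""
    return sorted(key for key, row in matrix.items() if not row["children"])

# Hypothetical chain A -> C -> E -> G.
chain = {
    "A": {"parents": set(), "children": {"C"}},
    "C": {"parents": {"A"}, "children": {"E"}},
    "E": {"parents": {"C"}, "children": {"G"}},
    "G": {"parents": {"E"}, "children": set()},
}
new_orders = find_new_orders(chain)
fill_orders = find_fill_orders(chain)
```

Because each row is examined independently, a scan of this shape can be split across workers without coordination between them.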
At step S704, once the new orders and/or fill orders are identified, the CAT processor 106 identifies the parent and/or child orders to determine linkages of the order lifecycle 400. The CAT processor 106 performs a graph walk of the lifecycle matrix 612 to identify the child orders of the new orders A and B. For example, in the case of the new order A, the CAT processor 106 scans the children column family of the lifecycle matrix 612 and identifies orders C and D as child orders of the new order A. The CAT processor 106 then identifies the children of orders C and D, and order A is written as a history variable to an additional column family of the lifecycle matrix 612. For example, the order A is added to a history column family for order C when the order E is identified as a child order of the order C. In addition, when the order E is pulled as a child order of order C, orders C and A are the history variables of order E. When the order G is pulled as a child order of order E, orders E, C, and A are passed as the history variables of order G, which can be written to the history column family for order G. Since order G is a fill order, there are no further child orders to pull.
At step S706, the CAT processor 106 continues to perform recursions to produce a complete order lifecycle by rolling up the linked orders from an end of the recursion where the fill orders are identified. In some implementations, a complete order lifecycle refers to an order lifecycle that identifies all new orders and all fill orders with every associated parent and/or child order. The CAT processor 106 performs one or more recursive operations to determine the order linkages associated with the new orders. For example, once the order G has been identified as a fill order during the graph walk starting from new order A, the recursion returns to order E to attempt to identify, or pull, order H. If order H has not yet been received by the CAT processor 106, order E can be flagged to indicate a break in the graph walk at order E. Information associated with order G can then be written to the history column family for order E, and information for orders G and E can be written to the history column family for order C. The information for order D and the events subordinate to the order D are also identified via performance of the lifecycle generation process 700. If the order H has not yet been received, the order F can also be flagged to indicate the break in the graph walk. In addition, the CAT processor 106 can output late notices to the CAT reporters 114 to trigger the CAT reporters 114 to report missing events.
The information for orders C, E, G, D, and F as well as information associated with the breaks in the graph walk at orders E and F are written to order A. The CAT processor 106 can reattempt to pull the child orders associated with the breaks at orders E and F. If the order H has been received, the graph walk continues until orders I and J are also pulled. As subsequent events are identified, the history column family for preceding parent orders is updated.
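The graph walk of steps S704 and S706 can be sketched as a depth-first recursion that passes the list of ancestors down as history and reports breaks back up. The function `walk`, the helper `make_node`, and the row layout are assumptions; the sample linkages follow the walk described above for new order A, with order H represented as a placeholder row that has not yet been received.

```python
def make_node(children=(), exists=True):
    # Hypothetical row: children column family, exists flag, history column family.
    return {"children": set(children), "verification": {"exists": exists}, "history": []}

def walk(matrix, order_id, history=()):
    """Depth-first walk from order_id; returns (descendants, breaks)."""
    row = matrix[order_id]
    row["history"] = list(history)  # ancestors, nearest first
    descendants, breaks = [], []
    for child_id in sorted(row["children"]):
        child = matrix.get(child_id)
        if child is None or not child["verification"]["exists"]:
            breaks.append(order_id)  # flag the parent where the walk breaks
            continue
        descendants.append(child_id)
        below, child_breaks = walk(matrix, child_id, (order_id,) + tuple(history))
        descendants.extend(below)
        breaks.extend(child_breaks)
    return descendants, breaks

lifecycle = {
    "A": make_node(["C", "D"]),
    "C": make_node(["E"]),
    "D": make_node(["F"]),
    "E": make_node(["G", "H"]),
    "F": make_node(["H"]),
    "G": make_node(),              # fill order, no children to pull
    "H": make_node(exists=False),  # referenced but not yet received
}
descendants, breaks = walk(lifecycle, "A")
```

In this sketch an order with more than one parent is visited once per path that reaches it, so its history reflects the last path walked; a fuller implementation would need to decide how to merge histories for such orders.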
In some implementations, the lifecycles built through the lifecycle generation process 700 are stored in one or more analytical data stores in the query database of the CAT processor 106 to facilitate queries based on one or more criteria. For example, a CAT Lifecycle ID data store allows regulators to submit a CAT Event ID and receive an output of an entire order lifecycle with every market event linked to the submitted CAT Event ID, and the output may include additional information associated with the events of the lifecycle. In addition, for a query of a CAT Lifecycle ID, the CAT processor 106 returns the CAT Event IDs for all of the events associated with the CAT Lifecycle ID. In some implementations, the query database may also include a CAT Order ID data store, which allows a regulator to submit a CAT Event ID and receive an output of a sequence of all market events between the queried event and all executions and/or initial orders.
FIGS. 8A-8C are exemplary diagrams of recursions of the lifecycle generation process 700 where a MapReduce function is used to identify relationships between the events of the order lifecycle 400. FIG. 8A illustrates a first iteration map 800 of the lifecycle generation process 700. As shown in FIG. 8A, each event of the order lifecycle 400 can include a key-value (KV) pair where the key is the parent event (or itself if there is no parent event) and the value is the order ID. The iteration map 800 can be reduced so that the orders associated with a particular parent order are included in the KV pair. The order lifecycle 400 can be built in N−1 iterations, where N is a number of edges associated with a longest traversal of the order lifecycle 400. FIG. 8B is an exemplary diagram of a second iteration map 802, and FIG. 8C is an exemplary diagram of a third iteration map 804 of the lifecycle generation process 700. For the subsequent iterations, each previous key includes the parent order as well as the previous key and values as a value.
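The iterative keying described for FIGS. 8A-8C can be approximated with a plain-Python sketch that stands in for a MapReduce job. It assumes each order has at most one parent and that every referenced parent has its own entry; the function name `group_by_root` and the sample linkages are hypothetical, and the number of passes this sketch takes is not claimed to match the iteration count stated in the disclosure.

```python
from collections import defaultdict

def group_by_root(parent_of):
    """parent_of maps each order ID to its parent ID, or None for a new order."""
    # First pass. Map: emit (parent or self, order ID). Reduce: group by key.
    groups = defaultdict(set)
    for order_id, parent in parent_of.items():
        groups[parent if parent is not None else order_id].add(order_id)
    # Later passes: re-key each group by the parent of its key, carrying the
    # previous key and its values forward as the new value.
    while any(parent_of[key] is not None for key in groups):
        regrouped = defaultdict(set)
        for key, values in groups.items():
            parent = parent_of[key]
            new_key = parent if parent is not None else key
            regrouped[new_key] |= values | {key}
        groups = regrouped
    return dict(groups)

# Hypothetical linkages: A -> C -> E -> G and A -> D.
parent_of = {"A": None, "C": "A", "D": "A", "E": "C", "G": "E"}
lifecycles = group_by_root(parent_of)
```

Each pass shortens every remaining parent chain by one edge, so the loop ends once every key is a new order, and the passes needed grow with the longest chain.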
Note that each of the functions of the described embodiments may be implemented by one or more processing circuits. A processing circuit includes a programmed processor (for example, processor 900 of FIG. 9), as a processor includes circuitry. A processing circuit/circuitry may also include devices such as an application specific integrated circuit (ASIC) and conventional circuit components arranged to perform the recited functions. The processing circuitry can be referred to interchangeably as circuitry throughout the disclosure. In addition, when the processors in each of the servers are programmed to perform the processes described herein, they become special-purpose order linkage devices. The processes performed by the CAT processor 106 are computationally intensive due to the large amount of data that is processed as the order linkages are determined for the order lifecycles. For example, in one implementation, the CAT processor 106 processes 100 billion events within four hours to determine an order lifecycle for the events and determine whether errors exist in the event data and/or the order lifecycle. In addition, one or more servers 126 associated with the CAT processor 106 of the order linkage system 100 can perform the processes described herein in parallel to increase processing speed and efficiency.
FIG. 9 shows an example of a computing device 950, such as the CAT processor 106, that can be used to implement the techniques described in this disclosure. The computing device is intended to represent various forms of digital hardware, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The components shown here, their connections and relationships, and their functions, are meant to be examples only, and are not meant to be limiting.
The computing device 950 includes a processor 900, a memory 902, a storage device 904, a high-speed interface 912 connecting to the memory 902 and multiple high-speed expansion ports 916, and a low-speed interface 910 connecting to a low-speed expansion port 914 and the storage device 904. Each of the processor 900, the memory 902, the storage device 904, the high-speed interface 912, the high-speed expansion ports 916, and the low-speed interface 910 are interconnected using various busses, such as communication bus 926, and may be mounted on a common motherboard or in other manners as appropriate.
The processor 900 can process instructions for execution within the computing device 950, including instructions stored in the memory 902 or on the storage device 904 to display graphical information for a GUI on an external input/output device, such as a display 908 coupled to the high-speed interface 912. In other implementations, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. Also, multiple computing devices may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system). The memory 902 stores information within the computing device. In some implementations, the memory 902 is a volatile memory unit or units. In some implementations, the memory 902 is a non-volatile memory unit or units. The memory 902 may also be another form of computer-readable medium, such as a magnetic or optical disk.
The storage device 904 is capable of providing mass storage for the computing device 950. In some implementations, the storage device 904 may be or contain a computer-readable medium, such as a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory or other similar solid state memory device, or an array of devices, including devices in a storage area network or other configurations. Instructions can be stored in an information carrier. The instructions, when executed by one or more processing devices (for example, processor 900), perform one or more methods, such as those described above. The instructions can also be stored by one or more storage devices such as computer- or machine-readable mediums (for example, the memory 902, the storage device 904, or memory on the processor 900).
The high-speed interface 912 manages bandwidth-intensive operations for the computing device 950, while the low-speed interface 910 manages lower bandwidth-intensive operations. Such allocation of functions is an example only. In some implementations, the high-speed interface 912 is coupled to the memory 902, the display 908 (e.g., through a graphics processor or accelerator), and to the high-speed expansion ports 916, which may accept various expansion cards (not shown). In the implementation, the low-speed interface 910 is coupled to the storage device 904 and the low-speed expansion port 914. The low-speed expansion port 914, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet), may be coupled to one or more input/output devices, such as a keyboard, a pointing device, a scanner, or a networking device such as a switch or router, e.g., through a network adapter.
The computing device 950 also includes a network controller 906, such as an Intel Ethernet PRO network interface card from Intel Corporation of America, for interfacing with a network 104. As can be appreciated, the network 104 can be a public network, such as the Internet, or a private network, such as a LAN or WAN, or any combination thereof, and can also include PSTN or ISDN sub-networks. The network 104 can also be wired, such as an Ethernet network, or can be wireless, such as a cellular network including EDGE, 3G, and 4G wireless cellular systems. The wireless network can also be Wi-Fi, Bluetooth, or any other wireless form of communication that is known.
Although the computing device of FIG. 9 is described as having a storage device 904, the claimed advancements are not limited by the form of the computer-readable media on which the instructions of the inventive process are stored. For example, the instructions may be stored on CDs, DVDs, in FLASH memory, RAM, ROM, PROM, EPROM, EEPROM, hard disk or any other information processing device with which the computing device communicates.
In alternate embodiments, processing features according to the present disclosure may be implemented and commercialized as hardware, a software solution, or a combination thereof. Moreover, instructions corresponding to the order linkage process 500 and/or lifecycle generation process 700 in accordance with the present disclosure could be stored in a portable drive such as a USB Flash drive that hosts a secure process.
Computer programs (also known as programs, software, software applications or code) associated with the processes described herein, such as the order linkage process 500 and/or lifecycle generation process 700, include machine instructions for a programmable processor, and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms machine-readable medium and computer-readable medium refer to any computer program product, apparatus and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term machine-readable signal refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back end, middleware, or front end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.
The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
A number of implementations have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of this disclosure. For example, preferable results may be achieved if the steps of the disclosed techniques were performed in a different sequence, if components in the disclosed systems were combined in a different manner, or if the components were replaced or supplemented by other components. The functions, processes and algorithms described herein may be performed in hardware or software executed by hardware, including computer processors and/or programmable circuits configured to execute program code and/or computer instructions to execute the functions, processes and algorithms described herein. Additionally, an implementation may be performed on modules or hardware not identical to those described. Accordingly, other implementations are within the scope that may be claimed.