Research Spotlight: Payment for Order Flow and Price Improvement

By Bradford (Lynch) Levy

By most accounts, the past decade has heralded a new age in retail investing: by eliminating commissions, retail brokers have “democratized finance.” In place of commissions, retail brokers now rely on payment for order flow (PFOF) to drive revenue. While the elimination of commissions sounds great, there is ongoing debate as to whether PFOF benefits retail investors. In a new study, I use a randomized experiment to explore this question. I find that PFOF is neither unambiguously good nor unambiguously bad, and that better measures of market conditions, better execution quality disclosures, and a more precise definition of “best execution” would enable retail investors to better protect themselves from conflicts of interest that might arise due to PFOF.

In effect, PFOF entails a wholesaler paying the retail broker for the right to trade against the broker’s order flow. Those in favor of PFOF argue that the practice provides investors with two main benefits: (i) better prices than those on exchanges (price improvement) and (ii) greater liquidity (size improvement); see, e.g., Cifu (2021). Those against PFOF argue that the practice stymies competition because brokers make routing decisions on behalf of customers rather than allowing market makers to compete over order flow; see, e.g., Gensler (2022). While both arguments are plausible, there is limited evidence on the subject.

One reason for the lack of evidence is the need to demonstrate that orders executed on-exchange would have executed at better prices had they been routed via PFOF. I address this challenge by conducting a randomized controlled trial that trades random stocks at random times across random brokers. The brokers include one providing direct market access and the two largest PFOF-based brokers by revenue (TD Ameritrade and Robinhood).

In practice, price improvement (PI) is measured by comparing a trade’s execution price to the national best bid and offer (NBBO), and is expressed as the dollar amount of improvement divided by the share price. Another measure is the effective spread over the quoted spread (EFQ), which captures how much of the quoted half-spread an investor paid to trade. For example, if a buy order executes at the quoted ask price, then EFQ equals 100% because the investor paid the full half-spread; if it executes at the mid-price, EFQ equals 0%.
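The two measures can be sketched in a few lines of Python. This is a minimal illustration rather than the study's code: the function names are hypothetical, and because the text does not say which "share price" is used as the denominator for PI, the sketch assumes the NBBO midpoint.

```python
import math


def _check_side(side):
    # Hypothetical helper: only "buy" and "sell" are supported in this sketch.
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")


def price_improvement_bps(side, exec_price, bid, ask):
    """Price improvement relative to the NBBO, in basis points.

    Assumption: the "share price" denominator is the NBBO midpoint.
    """
    _check_side(side)
    if side == "buy":
        improvement = ask - exec_price  # paid less than the best offer
    else:
        improvement = exec_price - bid  # received more than the best bid
    mid = (bid + ask) / 2
    return improvement / mid * 1e4


def efq(side, exec_price, bid, ask):
    """Effective over quoted spread as a fraction (1.0 = full half-spread paid)."""
    _check_side(side)
    half_spread = (ask - bid) / 2
    if half_spread <= 0:
        raise ValueError("quoted spread must be positive")
    mid = (bid + ask) / 2
    sign = 1 if side == "buy" else -1
    # Signed distance of the execution price from the midpoint,
    # scaled by the quoted half-spread.
    return sign * (exec_price - mid) / half_spread


if __name__ == "__main__":
    # Hypothetical quote: bid 10.00, ask 10.02.
    print(efq("buy", 10.02, 10.00, 10.02))  # buy at the ask: about 1.0 (100%)
    print(efq("buy", 10.01, 10.00, 10.02))  # buy at the mid: about 0.0 (0%)
    print(price_improvement_bps("buy", 10.01, 10.00, 10.02))
```

Under these definitions a negative EFQ means the order executed at a price better than the midpoint, and a negative PI means it executed outside the NBBO.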

One critique of PFOF is that the NBBO is a poor measure of market conditions: liquidity is usually available at better prices than the NBBO, via hidden order types and orders for less than 100 shares (known as odd-lots), so measuring against the NBBO overstates price improvement. Consistent with this argument, I find that most direct orders execute at better prices than the NBBO, receiving 4 basis points of PI on average. To put this in context, prior literature has estimated PI at 5 to 9 basis points. This suggests that using the NBBO as a benchmark overstates PI by as much as 400%, i.e., removing the 4 basis point bias leads to actual PI of 1 to 5 basis points.
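The arithmetic behind the 400% figure can be made explicit. The sketch below uses only the numbers quoted above (4 basis points of PI on direct orders, 5 to 9 basis points in prior literature); the function name is hypothetical, and "overstatement" is assumed to mean the bias expressed as a percentage of the adjusted PI.

```python
def adjust_for_benchmark_bias(reported_bps, bias_bps):
    """Return (adjusted PI, overstatement in percent of adjusted PI).

    Assumption: overstatement is measured relative to the adjusted figure.
    """
    adjusted = reported_bps - bias_bps
    if adjusted <= 0:
        raise ValueError("bias is at least as large as reported PI")
    overstatement_pct = 100 * bias_bps / adjusted
    return adjusted, overstatement_pct


if __name__ == "__main__":
    for reported in (5, 9):
        adjusted, over = adjust_for_benchmark_bias(reported, 4)
        print(f"reported {reported} bps -> adjusted {adjusted} bps, "
              f"overstated by {over:.0f}%")
```

At the low end of the range, 5 basis points reported against 1 basis point adjusted is a 400% overstatement; at the high end, 9 against 5 is an 80% overstatement.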

Examining PI across brokers, I find economically and statistically significant heterogeneity. Figure 1 presents the proportion of orders that execute at a given EFQ or better. Using direct orders as the benchmark, roughly 20% execute at the mid-price or better (with an EFQ of 0% or better). Consistent with the notion that PFOF can benefit retail investors, more than 75% of orders routed to TD Ameritrade execute at the mid-price or better. In contrast, only 25% of orders routed to Robinhood execute at the mid-price or better, which is not statistically different from the benchmark. Put another way, I find that Robinhood does not provide PI after controlling for the true market conditions.

Figure 1. Execution Quality Across Brokers
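A curve of the kind described for Figure 1 is an empirical cumulative distribution of EFQ per broker. The sketch below shows one way to compute it; the function name is hypothetical and the EFQ values are made up purely for illustration, not taken from the study.

```python
def share_at_or_better(efq_values, threshold):
    """Fraction of orders whose EFQ is at or below `threshold`.

    A lower EFQ means a better price, so "at a given EFQ or better"
    translates to EFQ <= threshold.
    """
    if not efq_values:
        raise ValueError("need at least one order")
    hits = sum(1 for value in efq_values if value <= threshold)
    return hits / len(efq_values)


# Made-up EFQ values (as fractions) for illustration only; NOT study data.
example_efq = [-0.5, 0.0, 0.0, 0.4, 1.0, 1.0, 1.0, 1.0]

# One point on the curve per threshold: mid-price, half of the
# half-spread, and the full quoted half-spread.
curve = {t: share_at_or_better(example_efq, t) for t in (0.0, 0.5, 1.0)}

if __name__ == "__main__":
    for threshold, share in curve.items():
        print(f"EFQ <= {threshold:.0%}: {share:.1%} of orders")
```

Evaluating the same function on each broker's orders and comparing the values at a threshold of 0 corresponds to the "mid-price or better" comparison in the text.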

The results are striking because both TD Ameritrade and Robinhood use the same wholesalers. Differences in the amount of PFOF each broker receives from a given wholesaler explain the lack of PI at Robinhood. For example, Susquehanna pays TD Ameritrade $0.10 per hundred shares and delivers mid-price execution. In contrast, the same wholesaler pays Robinhood $0.75 per hundred shares and delivers zero price improvement. This suggests that Robinhood’s agreements with wholesalers sacrifice PI in exchange for increased PFOF, which is exactly the conflict of interest that Chairman Gensler has expressed concerns about.

My study has several policy implications. First, the empirical fact that orders submitted directly to markets receive significant price improvement highlights that the NBBO is a poor benchmark of market conditions. This price improvement comes from two channels: odd-lot liquidity, which is excluded from the NBBO per Regulation NMS, and hidden order types. While it is difficult to include hidden orders in the NBBO, my results suggest that the inclusion of odd-lots would be a marked improvement. Along these lines, in 2020 the SEC adopted rules to redefine round-lots such that orders for less than 100 shares will contribute to the NBBO depending on the price of the specific stock. However, two years later, these rules have yet to be implemented, and some market participants have argued for a simpler approach: “[a] better solution involves a market where all orders count equally” toward the NBBO regardless of size (NASDAQ, 2021).

Second, my study shows that PFOF does not unambiguously benefit or harm investors. Instead, broker-specific practices affect who benefits from PFOF. If consumers could readily discern the differences in execution quality across brokers, then this alone would not be a problem. However, these differences cannot be inferred from the current disclosure regime, so consumers would need to run an experiment similar to my study to ascertain them. One way to address this problem is to have a regulator or trusted third party run a continuous experiment to measure execution quality across brokers and publish those statistics for consumers to use when selecting a broker. Similar information is routinely produced for a variety of other products, e.g., by Consumer Reports.

Finally, FINRA Rule 5310 requires brokers to use reasonable diligence “so that the resultant price to the customer is as favorable as possible under prevailing market conditions.” The study highlights that some brokers systematically deliver more favorable prices than others under the same market conditions. However, the legal precedent and the language in Rule 5310 are sufficiently vague that it is unclear whether FINRA would consider the systematic differences I identify to be a violation of the rule. FINRA could address these shortcomings by explicitly defining “prevailing market conditions,” e.g., the NBBO if that is its benchmark, and by stating the threshold at which differences across brokers are considered unfavorable. This would empower investors to raise material issues to FINRA while suppressing nonmaterial ones.

Bradford (Lynch) Levy is a PhD Candidate in Accounting at The Wharton School.

The views and ideas expressed in this post are those of the author and do not necessarily represent those of the Wharton School or the Wharton Initiative on Financial Policy and Regulation.