These datasets analyze arbitrage opportunities across several Ethereum rollups (Arbitrum, Base, and Optimism). The research examines potential arbitrage paths both within individual chains (single-chain) and across different L2 networks (cross-chain), identifying profitable trading opportunities created by price discrepancies.
The schema of each dataset is included below. The scripts for running the data pipeline outlined in this document are included in the repository.
The datasets are curated by Christof Torres as part of the TLDR 2025 fellowship program.
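As a quick orientation for the schemas listed below, here is a minimal Python sketch of how the records might be read and summarized. It is not part of the repository's pipeline scripts. The helper names (`load_records`, `profit_by_chain`, `chains_in_path`) are hypothetical, and the on-disk layout (a single JSON array versus one JSON document per line) is an assumption, since this document does not specify it. The field names come from the schemas in this document.

```python
import json
from collections import defaultdict


def load_records(path):
    """Load records from a JSON array file, falling back to JSON Lines.

    Assumption: the file layout is not documented here, so both a single
    JSON array and one-document-per-line exports are handled.
    """
    with open(path, "r", encoding="utf-8") as fh:
        text = fh.read()
    try:
        data = json.loads(text)
        # A file holding a single document is wrapped in a list.
        return data if isinstance(data, list) else [data]
    except json.JSONDecodeError:
        # Several documents separated by newlines (JSON Lines).
        return [json.loads(line) for line in text.splitlines() if line.strip()]


def profit_by_chain(records):
    """Sum total_profit_usd per chain for single-chain records."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["chain"]] += float(rec.get("total_profit_usd", 0.0))
    return dict(totals)


def chains_in_path(path_entry):
    """Return the chains touched by one path entry, in order of first use."""
    seen = []
    for hop in path_entry.get("path", []):
        if hop["chain"] not in seen:
            seen.append(hop["chain"])
    return seen


if __name__ == "__main__":
    # Synthetic values for illustration only; they are not taken from the data.
    demo_records = [
        {"chain": "base", "total_profit_usd": 1.5},
        {"chain": "arbitrum", "total_profit_usd": 0.25},
    ]
    print(profit_by_chain(demo_records))

    demo_path = {
        "path": [{"chain": "base"}, {"chain": "optimism"}],
        "profit_usd": 1.0,
        "amount_usd": 100.0,
    }
    print(chains_in_path(demo_path))
```

With the real files, `load_records("singlechainprofitable_paths.json")` would replace the synthetic records. The files are over 100 MB each, so a streaming parser may be preferable to reading a whole file into memory.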
name: singlechainprofitable_paths.json
description: Contains profitable arbitrage paths identified within a single chain
date_range: Sample data from November 2024
size: 108.9 MB
structure:
source: Path Simulator output from processed Dune and Allium data
blockchain: Arbitrum, Base, Optimism
variables:
_id [Object]: MongoDB document identifier
id [String]: Unique identifier for the arbitrage opportunity
chain [String]: The blockchain network (base, arbitrum, or optimism)
block_number [Integer]: Block number where the arbitrage opportunity was identified
timestamp_range_start [Integer]: Unix timestamp of the start of the time range
timestamp_range_stop [Integer]: Unix timestamp of the end of the time range
number_of_updated_pools [Integer]: Count of pools that updated within the time range
number_of_updated_paths [Integer]: Count of paths that were affected by pool updates
number_of_paths_with_positive_gains [Integer]: Count of paths with theoretical positive returns
number_of_simulated_paths [Integer]: Count of paths that were fully simulated
number_of_profitable_non_conflicting_paths [Integer]: Count of profitable paths that don't conflict with each other
profitable_non_conflicting_paths [Array]: List of profitable paths that can be executed together
total_profit_usd [Float]: Total potential profit in USD from all profitable non-conflicting paths

name: crosschainprofitable_paths.json
description: Contains profitable arbitrage paths that span multiple L2 networks
date_range: Sample data from November 2024
size: 156.7 MB
structure:
source: Path Simulator output from processed Dune and Allium data
blockchain: Arbitrum, Base, Optimism
variables:
_id [Object]: MongoDB document identifier
id [String]: Unique identifier for the cross-chain arbitrage opportunity
timestamp_range_start [Integer]: Unix timestamp of the start of the time range
timestamp_range_stop [Integer]: Unix timestamp of the end of the time range
number_of_updated_pools [Integer]: Count of pools that updated within the time range
number_of_updated_paths [Integer]: Count of paths that were affected by pool updates
number_of_paths_with_positive_gains [Integer]: Count of paths with theoretical positive returns
number_of_simulated_paths [Integer]: Count of paths that were fully simulated
number_of_profitable_non_conflicting_paths [Integer]: Count of profitable paths that don't conflict with each other
profitable_non_conflicting_paths [Array]: List of profitable paths that can be executed across chains
total_profit_usd [Float]: Total potential profit in USD from all profitable non-conflicting paths

Within profitable_non_conflicting_paths, each path contains:
path [Array]: Sequential list of swap operations across different chains
profit_usd [Float]: Potential profit in USD from this specific path
amount_usd [Float]: USD value of the initial amount used in the path

Each item in the path array contains:
chain [String]: The blockchain network for this swap
address [String]: Contract address of the liquidity pool used
protocol [String]: DEX protocol (e.g., uniswap)
version [String]: Protocol version (e.g., v2, v3)
token_in_symbol [String]: Symbol of the input token
token_out_symbol [String]: Symbol of the output token
token_in_address [String]: Contract address of the input token
token_out_address [String]: Contract address of the output token
token_in_decimals [String]: Number of decimals for the input token
token_out_decimals [String]: Number of decimals for the output token
fee_tier [String]: Fee tier of the pool (for Uniswap V3)
zero_for_one [Boolean]: Direction of the swap in the pool
token_in_amount [Float]: Amount of input token used
token_out_amount [Float]: Amount of output token received

An overview of the data pipeline for this paper is shown in the diagram and described below:
[Figure: Arbitrage Simulation Pipeline]
A more detailed breakdown of this data pipeline is outlined below.