Theta-Join SQL Operator Optimization on Distributed Systems
This project aims to design algorithms to accelerate theta-join SQL operators on distributed systems.
Project Details
- Algorithms:
  - “Equal” (=) SQL operator acceleration algorithm
  - “Less than” (<) SQL operator acceleration algorithm
  - “Greater than” (>) SQL operator acceleration algorithm
- Environment:
  - Memory: 40 * 50 GB
  - CPU Cores: 40 * 7
  - Platform: Spark 2.2.0 + JDK 1.8
- Data Set:
- Performance:
  - 5 to 400 times faster than Spark SQL.
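As an illustration of what a “less than” theta-join computes, here is a minimal single-machine sketch in Python. It is not this project's algorithm, which is not described here. The function names `less_than_join_naive` and `less_than_join_sorted` are hypothetical, and the sort-plus-binary-search approach is a generic textbook technique chosen only to show why an inequality join need not compare every pair of rows.

```python
from bisect import bisect_right
from typing import List, Tuple


def less_than_join_naive(left: List[int], right: List[int]) -> List[Tuple[int, int]]:
    """Baseline: test the predicate l < r on every pair of keys."""
    return [(l, r) for l in left for r in right if l < r]


def less_than_join_sorted(left: List[int], right: List[int]) -> List[Tuple[int, int]]:
    """Illustrative alternative (an assumption, not the project's method):
    sort the right side once, then binary-search the first matching key
    for each left key instead of scanning the whole right side."""
    right_sorted = sorted(right)
    result: List[Tuple[int, int]] = []
    for l in left:
        # Index of the first right key strictly greater than l.
        start = bisect_right(right_sorted, l)
        result.extend((l, r) for r in right_sorted[start:])
    return result


if __name__ == "__main__":
    left_keys = [1, 5, 3]
    right_keys = [2, 4, 4]
    print(sorted(less_than_join_sorted(left_keys, right_keys)))
```

Both functions return the same set of pairs. The sorted variant skips non-matching right keys entirely, so its cost is dominated by the sort and the size of the output rather than by the product of the input sizes. A distributed version would additionally have to partition the keys across workers, which this sketch does not attempt.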
People