分区表join时,过滤分区,以下两种写法都可以,其执行过程相同
- 过滤条件写在on中 ```sql explain select a.city_name,b.name from crisps_ods.ods_crisps_mch_address a left join crisps_ods.ods_crisps_mch_detail b on a.mch_detail_id = b.id and b.p_date = ‘2022-03-31’;
STAGE DEPENDENCIES: Stage-4 is a root stage , consists of Stage-5, Stage-1 Stage-5 has a backup stage: Stage-1 Stage-3 depends on stages: Stage-5 Stage-1 Stage-0 depends on stages: Stage-3, Stage-1
STAGE PLANS: Stage: Stage-4 Conditional Operator
Stage: Stage-5 Map Reduce Local Work Alias -> Map Local Tables: $hdt$_1:b Fetch Operator limit: -1 Alias -> Map Local Operator Tree: $hdt$_1:b TableScan alias: b Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE Filter Operator predicate: (p_date = ‘2022-03-31’) (type: boolean) Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE Select Operator expressions: id (type: bigint), name (type: varchar(128)) outputColumnNames: _col0, _col1 Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE HashTable Sink Operator keys: 0 _col0 (type: bigint) 1 _col0 (type: bigint)
Stage: Stage-3 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 22040 Data size: 1278351 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: mch_detail_id (type: bigint), city_name (type: varchar(50)) outputColumnNames: _col0, _col1 Statistics: Num rows: 22040 Data size: 1278351 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Left Outer Join0 to 1 keys: 0 _col0 (type: bigint) 1 _col0 (type: bigint) outputColumnNames: _col1, _col3 Statistics: Num rows: 24244 Data size: 1406186 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col1 (type: varchar(50)), _col3 (type: varchar(128)) outputColumnNames: _col0, _col1 Statistics: Num rows: 24244 Data size: 1406186 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 24244 Data size: 1406186 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Local Work: Map Reduce Local Work
Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: a Statistics: Num rows: 22040 Data size: 1278351 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: mch_detail_id (type: bigint), city_name (type: varchar(50)) outputColumnNames: _col0, _col1 Statistics: Num rows: 22040 Data size: 1278351 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: bigint) sort order: + Map-reduce partition columns: _col0 (type: bigint) Statistics: Num rows: 22040 Data size: 1278351 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: varchar(50)) TableScan alias: b Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE Filter Operator predicate: (p_date = ‘2022-03-31’) (type: boolean) Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE Select Operator expressions: id (type: bigint), name (type: varchar(128)) outputColumnNames: _col0, _col1 Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE Reduce Output Operator key expressions: _col0 (type: bigint) sort order: + Map-reduce partition columns: _col0 (type: bigint) Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE value expressions: _col1 (type: varchar(128)) Reduce Operator Tree: Join Operator condition map: Left Outer Join0 to 1 keys: 0 _col0 (type: bigint) 1 _col0 (type: bigint) outputColumnNames: _col1, _col3 Statistics: Num rows: 24244 Data size: 1406186 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col1 (type: varchar(50)), _col3 (type: varchar(128)) outputColumnNames: _col0, _col1 Statistics: Num rows: 24244 Data size: 1406186 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 24244 Data size: 1406186 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink
- 过滤条件写在where中
```sql
explain select
a.city_name,b.name from crisps_ods.ods_crisps_mch_address a
left join ( select * from crisps_ods.ods_crisps_mch_detail where p_date = '2022-03-31') b
on a.mch_detail_id = b.id ;
STAGE DEPENDENCIES:
Stage-4 is a root stage , consists of Stage-5, Stage-1
Stage-5 has a backup stage: Stage-1
Stage-3 depends on stages: Stage-5
Stage-1
Stage-0 depends on stages: Stage-3, Stage-1
STAGE PLANS:
Stage: Stage-4
Conditional Operator
Stage: Stage-5
Map Reduce Local Work
Alias -> Map Local Tables:
$hdt$_1:ods_crisps_mch_detail
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
$hdt$_1:ods_crisps_mch_detail
TableScan
alias: ods_crisps_mch_detail
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Filter Operator
predicate: (p_date = '2022-03-31') (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
expressions: id (type: bigint), name (type: varchar(128))
outputColumnNames: _col0, _col1
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
HashTable Sink Operator
keys:
0 _col0 (type: bigint)
1 _col0 (type: bigint)
Stage: Stage-3
Map Reduce
Map Operator Tree:
TableScan
alias: a
Statistics: Num rows: 22040 Data size: 1278351 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: mch_detail_id (type: bigint), city_name (type: varchar(50))
outputColumnNames: _col0, _col1
Statistics: Num rows: 22040 Data size: 1278351 Basic stats: COMPLETE Column stats: NONE
Map Join Operator
condition map:
Left Outer Join0 to 1
keys:
0 _col0 (type: bigint)
1 _col0 (type: bigint)
outputColumnNames: _col1, _col3
Statistics: Num rows: 24244 Data size: 1406186 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: _col1 (type: varchar(50)), _col3 (type: varchar(128))
outputColumnNames: _col0, _col1
Statistics: Num rows: 24244 Data size: 1406186 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 24244 Data size: 1406186 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Local Work:
Map Reduce Local Work
Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: a
Statistics: Num rows: 22040 Data size: 1278351 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: mch_detail_id (type: bigint), city_name (type: varchar(50))
outputColumnNames: _col0, _col1
Statistics: Num rows: 22040 Data size: 1278351 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: bigint)
sort order: +
Map-reduce partition columns: _col0 (type: bigint)
Statistics: Num rows: 22040 Data size: 1278351 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: varchar(50))
TableScan
alias: ods_crisps_mch_detail
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Filter Operator
predicate: (p_date = '2022-03-31') (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
expressions: id (type: bigint), name (type: varchar(128))
outputColumnNames: _col0, _col1
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: bigint)
sort order: +
Map-reduce partition columns: _col0 (type: bigint)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
value expressions: _col1 (type: varchar(128))
Reduce Operator Tree:
Join Operator
condition map:
Left Outer Join0 to 1
keys:
0 _col0 (type: bigint)
1 _col0 (type: bigint)
outputColumnNames: _col1, _col3
Statistics: Num rows: 24244 Data size: 1406186 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: _col1 (type: varchar(50)), _col3 (type: varchar(128))
outputColumnNames: _col0, _col1
Statistics: Num rows: 24244 Data size: 1406186 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 24244 Data size: 1406186 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
ListSink