Spark 2 SQL dynamic partition error - Big Data

The error message was as follows:

ERROR yarn.ApplicationMaster: User class threw exception: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Number of dynamic partitions created is 3464, which is more than 1000. To solve this try to set hive.exec.max.dynamic.partitions to at least 3464.;
org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Number of dynamic partitions created is 3464, which is more than 1000. To solve this try to set hive.exec.max.dynamic.partitions to at least 3464.;
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:107)
    at org.apache.spark.sql.hive.HiveExternalCatalog.loadDynamicPartitions(HiveExternalCatalog.scala:829)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:319)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:221)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:413)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)

After some searching, I added the following settings to the Spark code:

hiveContext.sql("set hive.exec.dynamic.partition=true")
hiveContext.sql("set hive.exec.dynamic.partition.mode=nonstrict")
hiveContext.sql("SET hive.exec.max.dynamic.partitions=100000")
hiveContext.sql("SET hive.exec.max.dynamic.partitions.pernode=100000")

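The error persisted.
According to what I found online (see https://github.com/apache/spark/pull/17223), Spark 2 does not apply Hive configuration changes made from code in this way.
So I tried editing hive-site.xml instead and added the following configuration: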

<property>
  <name>hive.exec.dynamic.partition</name>
  <value>true</value>
</property>
<property>
  <name>hive.exec.dynamic.partition.mode</name>
  <value>nonstrict</value>
</property>
<property>
  <name>hive.exec.max.dynamic.partitions</name>
  <value>100000</value>
</property>
<property>
  <name>hive.exec.max.dynamic.partitions.pernode</name>
  <value>100000</value>
</property>

With these settings in place, the job ran successfully!
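If editing hive-site.xml is not an option, for example on a shared cluster, a commonly suggested alternative is to supply the same values when the session is created rather than through SET afterwards. The sketch below was not part of the fix described above and has not been tested against this job. It assumes Spark 2.x with Hive support on the classpath, and it relies on Spark copying properties prefixed with `spark.hadoop.` into the Hadoop configuration from which the Hive client is built. The object name and application name are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object DynamicPartitionJob {
  def main(args: Array[String]): Unit = {
    // Assumption: the values are picked up because they are set before the
    // session (and therefore its Hive client) is created.
    val spark = SparkSession.builder()
      .appName("dynamic-partition-job") // hypothetical name
      .config("spark.hadoop.hive.exec.dynamic.partition", "true")
      .config("spark.hadoop.hive.exec.dynamic.partition.mode", "nonstrict")
      .config("spark.hadoop.hive.exec.max.dynamic.partitions", "100000")
      .config("spark.hadoop.hive.exec.max.dynamic.partitions.pernode", "100000")
      .enableHiveSupport()
      .getOrCreate()

    // Run the INSERT that uses dynamic partitions here.

    spark.stop()
  }
}
```

The same properties can also be passed on the command line, for example `spark-submit --conf spark.hadoop.hive.exec.max.dynamic.partitions=100000`, which keeps the values out of the application code.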

hive.exec.dynamic.partition

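Default: false (in older Hive releases; Hive 0.9.0 and later default to true)
Whether dynamic partitioning is enabled.
This parameter must be true in order to use dynamic partitions.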

hive.exec.dynamic.partition.mode

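Default: strict
The dynamic partitioning mode. In strict mode (the default), at least one partition column must be given a static value; nonstrict mode allows every partition column to be dynamic.
It usually needs to be set to nonstrict.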
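The difference between the two modes can be sketched in plain Scala, without Spark. This is an illustration only, not Hive's actual validation logic: the object name, the helper functions, and the partition columns `year` and `day` are all hypothetical, and the only rule modelled is the one stated above (strict mode needs at least one static partition column).

```scala
object PartitionModeSketch {
  // A partition column with an optional static value:
  // Some("2017") means static, None means dynamic.
  type PartitionSpec = Seq[(String, Option[String])]

  // Renders the PARTITION clause of an INSERT statement.
  def partitionClause(spec: PartitionSpec): String =
    spec.map {
      case (col, Some(value)) => s"$col='$value'"
      case (col, None)        => col
    }.mkString("PARTITION (", ", ", ")")

  // Strict mode accepts the spec only if at least one column is static.
  def allowedInStrictMode(spec: PartitionSpec): Boolean =
    spec.exists { case (_, value) => value.isDefined }

  def main(args: Array[String]): Unit = {
    val mixed: PartitionSpec = Seq("year" -> Some("2017"), "day" -> None)
    val allDynamic: PartitionSpec = Seq("year" -> None, "day" -> None)
    for (spec <- Seq(mixed, allDynamic)) {
      println(s"${partitionClause(spec)} allowed in strict mode: ${allowedInStrictMode(spec)}")
    }
  }
}
```

An insert whose clause looks like the second case, with every partition column left dynamic, is the situation that requires nonstrict.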

hive.exec.max.dynamic.partitions.pernode

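Default: 100
The maximum number of dynamic partitions that may be created on each node that runs an MR task.
Set this according to the actual data.
For example, if the source data covers a full year, so that the day column has 365 distinct values, this parameter must be at least 365; with the default of 100 the job fails.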
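The arithmetic behind that example can be sketched in plain Scala. This is a simplified model, not Hive's implementation: it assumes a single task sees every value of the partition column, and the object name, function name, and sample values are hypothetical. The message only imitates the wording of the error shown at the top of this post.

```scala
object DynamicPartitionLimitSketch {
  // Returns a message modelled on the Hive error when the number of distinct
  // partition values exceeds the limit, otherwise None.
  def checkLimit(partitionValues: Seq[String], limit: Int): Option[String] = {
    val created = partitionValues.distinct.size
    if (created > limit)
      Some(s"Number of dynamic partitions created is $created, which is more than $limit.")
    else
      None
  }

  def main(args: Array[String]): Unit = {
    // One year of data: 365 distinct values of the partition column `day`.
    val days = (1 to 365).map(d => f"day-$d%03d")
    println(checkLimit(days, 100)) // default per-node limit, too small for a year
    println(checkLimit(days, 365)) // smallest limit that fits this data
  }
}
```

In a real job, the number to compare against the limit could be obtained by counting the distinct values of the partition columns in the source data before running the insert.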

hive.exec.max.dynamic.partitions

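Default: 1000
The maximum total number of dynamic partitions that may be created across all nodes that run MR tasks.
The same sizing considerations apply as for the previous parameter.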

hive.exec.max.created.files

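Default: 100000
The maximum number of HDFS files that may be created by the whole MR job.
The default is usually sufficient; adjust it only if the data volume is so large that more than 100000 files need to be created.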

hive.error.on.empty.partition

Default: false
Whether to throw an exception when an empty partition is generated.
This usually does not need to be set.