How to choose the number of buckets?
It depends on the data size of the table. Example: table data size / 128 MB (HDFS block size) = 31 splits.
Pick the smallest n such that 2^n > 31, i.e. n = 5, so use 2^5 = 32 buckets.
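The rule of thumb above can be sketched in Python. The 128 MB block size and the power-of-two rule come from the notes; the function name is just for illustration:

```python
import math

def choose_num_buckets(table_size_mb, block_size_mb=128):
    """Smallest power of two strictly greater than table_size / block_size."""
    splits = math.ceil(table_size_mb / block_size_mb)  # e.g. 3968 MB / 128 MB -> 31
    n = 0
    while 2 ** n <= splits:
        n += 1
    return 2 ** n

print(choose_num_buckets(3968))  # 31 splits -> 2^5 = 32 buckets
```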
List the partitions of a table
show partitions txnrecords_dp;
Drop/Remove a partition from metastore
show partitions txnrecords_dp;
alter table txnrecords_dp DROP IF EXISTS PARTITION(category="Games");
show partitions txnrecords_dp;
For an internal (managed) table, this removes the partition's data from HDFS as well
For an external table, only the metastore entry is dropped; the data stays on HDFS
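A sketch of the external-table case (the table name and location here are made up for illustration) showing why DROP PARTITION leaves the files in place:

```sql
-- External table: Hive tracks metadata only; files live at LOCATION
CREATE EXTERNAL TABLE txnrecords_ext (txnid INT, amount DOUBLE)
PARTITIONED BY (category STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/apps/hive/xyz/txnrecords_ext';

-- Removes only the metastore entry; the files under
-- /apps/hive/xyz/txnrecords_ext/category=Games remain on HDFS
ALTER TABLE txnrecords_ext DROP IF EXISTS PARTITION (category='Games');
```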
Add partition
alter table txnrecords_dp ADD PARTITION(category="Games");
show partitions txnrecords_dp;
Add multiple partitions in a single command
alter table txnrecords_dp ADD PARTITION(category="Games-1") PARTITION(category="Games-2");
ADD PARTITION only registers the partition in the metastore; any data already present at the partition's HDFS location becomes queryable, but no data is loaded automatically
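A minimal sketch (the input path is hypothetical): ADD PARTITION creates an empty partition, and data arrives in a separate LOAD step or must already sit in the partition's directory:

```sql
ALTER TABLE txnrecords_dp ADD PARTITION (category='Games');

-- Hypothetical path; LOAD DATA moves the file into the partition's directory
LOAD DATA INPATH '/tmp/games_data.csv'
INTO TABLE txnrecords_dp PARTITION (category='Games');
```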
Edit a partition, e.g. rename its value
alter table txnrecords_dp PARTITION(category="Games") RENAME TO PARTITION(category="Games-old");
Delete a partition directory directly on HDFS (bypassing the metastore/Hive)
hdfs dfs -rm -r /apps/hive/xyz/txnrecords_dp/category=Puzzles
Check whether the partition still shows in Hive
show partitions txnrecords_dp;
It is still listed: the metastore/Hive does not detect changes made directly on HDFS
You have to make Hive pick up the change using the msck repair command
How to sync from HDFS to the metastore/Hive
hive> msck repair table txnrecords_dp;
Note: by default msck repair only adds partitions that exist on HDFS but are missing from the metastore; from Hive 3.0, msck repair table txnrecords_dp sync partitions; also removes partitions whose directories were deleted
How to create a new table from an existing partitioned table (new table should not contain partitions) ?
create table txnrecords_dp_bkp as select * from txnrecords_dp;
A MapReduce job runs because the data is physically copied
Only the data and schema are copied, not the partition definitions
show partitions txnrecords_dp_bkp;
No partitions are copied
Create a backup table with only the schema (no data) from an existing table
create table txnrecords_dp_bkp_schema LIKE txnrecords_dp;
Only the schema & partition definitions are copied (but no data)
Run an HQL script file
hive -f test.hql
Run an inline query (hive -e takes a query string, not a file)
hive -e "select * from batch5_hadoop.txnrecords_dp limit 10"
Hive - rename a table
hive> alter table txnrecords_dp_bkp_schema RENAME TO txnrecords_dp_bkp_schema1;
show create table txnrecords_dp_bkp_schema1;
Hive - case
select id, name, marks,
CASE WHEN marks > 65 THEN 'PASS'
     WHEN marks < 65 THEN 'FAIL'
     ELSE 'NO_MARKS'
END as result
from student_data;
PySpark, BigData, SQL, Hive, AWS, Python, Unix/Linux, Shortcuts, Examples, Scripts, Perl
May 13, 2023
Hive Interview Questions