Linux and Hadoop commands for data engineers!

Brinda Potluri
2 min read · Nov 2, 2023


Interested in learning Big Data concepts, from basic to advanced, with just one article every week? Hit the follow button!

Linux follows a tree-like directory structure (hierarchy). Every path starts at the root directory (/), and everything else lives inside it. What Windows calls a folder, Linux calls a directory.

Want to check which is the present working directory?

→ pwd

Want to know who is logged in?

→ whoami

Want to clear the screen?

→ clear

Want to change the directory?

→ cd /directory_path

Want to change the directory to root?

→ cd /

Want to go to the home directory from anywhere?

→ cd ~

Want to go to the previous directory?

→ cd -
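The navigation commands above can be tried together in one short session. This is only a minimal sketch: the variable names (root_dir, home_dir, back_dir) are made up for illustration, and the home directory printed will depend on your own account.

```shell
#!/bin/sh
# Move to the root directory and record where we are.
cd /
root_dir=$(pwd)

# Jump to the home directory from anywhere.
cd ~
home_dir=$(pwd)

# "cd -" returns to the previous directory (here: root).
# It also prints that directory, so the output is silenced.
cd - >/dev/null
back_dir=$(pwd)

echo "logged in as: $(whoami)"
echo "root:         $root_dir"
echo "home:         $home_dir"
echo "after cd -:   $back_dir"
```

After "cd -", the session is back in the root directory, because that was the directory in use just before "cd ~".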

Hadoop Commands:

Prefix many of your Linux file commands (such as ls, mkdir, cat, and rm) with ‘hadoop fs -’ or ‘hdfs dfs -’ and they’ll work on HDFS. For example, ‘ls’ becomes ‘hadoop fs -ls’.
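To make the prefix rule concrete, here is a small sketch that prints the HDFS form of a few familiar commands. The helper to_hdfs and the paths under /user/data are hypothetical and exist only for this illustration; the script just prints the commands and does not need a Hadoop cluster.

```shell
#!/bin/sh
# Hypothetical helper: build the HDFS form of a Linux file command
# by adding the "hadoop fs -" prefix. It only prints the command.
to_hdfs() {
    cmd=$1
    shift
    echo "hadoop fs -$cmd $*"
}

to_hdfs ls /user/data           # list a directory
to_hdfs mkdir /user/data/new    # create a directory
to_hdfs cat /user/data/a.txt    # print a file
to_hdfs rm /user/data/a.txt     # delete a file
```

Not every Linux command carries over this way. HDFS has no notion of a current working directory, so ‘cd’ and ‘pwd’ have no HDFS counterpart.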

Some HDFS-specific commands used for data movement are:

→ hadoop fs -put /local_file_path /hdfs_path: To copy a file from local (gateway node) to HDFS. You can also use ‘hadoop fs -copyFromLocal /local_file_path /hdfs_path’ to do the same task.

→ hadoop fs -copyToLocal /hdfs_path .: To copy a file from HDFS to local. Here ‘.’ means the current directory. You can also use ‘hadoop fs -get /hdfs_path .’ to do the same task.
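A round trip between local and HDFS might look like the sketch below. The file name sales.csv, the HDFS directory, and the run helper are all assumptions made for illustration. The helper prints each command and executes it only when the hadoop client is installed, so the script is safe to try on a machine without a cluster.

```shell
#!/bin/sh
# Hypothetical paths, chosen only for this example.
LOCAL_FILE="/tmp/sales.csv"
HDFS_DIR="/user/$(whoami)/demo"

# Hypothetical helper: print the command, and run it only if
# the hadoop client is available on this machine.
run() {
    echo "+ $*"
    if command -v hadoop >/dev/null 2>&1; then
        "$@"
    fi
}

run hadoop fs -mkdir -p "$HDFS_DIR"           # create the target directory
run hadoop fs -put "$LOCAL_FILE" "$HDFS_DIR"  # local -> HDFS
run hadoop fs -ls "$HDFS_DIR"                 # confirm the file arrived
run hadoop fs -get "$HDFS_DIR/sales.csv" .    # HDFS -> current local directory
```

On a gateway node, the same four commands can be typed directly without the run wrapper.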

These were some of the most important commands!

Hit the clap button and comment your views if you got any value from this article (you can clap up to 50 times!). Your appreciation means a lot to me :)

Feel free to connect and message me on my LinkedIn.

References:

  1. Sumit Sir’s Big Data
