distcp from Hadoop cluster to AWS S3 bucket
This article may help you copy data between an on-premises Hadoop cluster and an AWS S3 bucket.
Knowledge is Power
The bash command below reports the versions of the packages available to your Python interpreter. Make sure you are running the correct version of the Python interpreter. Update: 2020-05-14 for i in pandas numpy sqlalchemy logging logging.handlers datetime sys re os...
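Since the excerpt above is cut off, here is a minimal sketch of the same idea rather than the author's actual command. It assumes `python3` is on the PATH; the helper name `module_version` is hypothetical, and modules without a `__version__` attribute are reported as such.

```shell
# Hypothetical helper: print "<module>: <version>" for one Python module.
# Assumes python3 is on PATH; this is not the command from the original post.
module_version() {
  python3 - "$1" <<'PY'
import importlib
import sys

name = sys.argv[1]
try:
    module = importlib.import_module(name)
except ImportError:
    print(f"{name}: not installed")
else:
    # Many standard-library modules carry no __version__ attribute.
    print(f"{name}: {getattr(module, '__version__', 'no __version__ attribute')}")
PY
}

# Loop over a few of the module names listed in the excerpt.
for i in logging sys os; do
  module_version "$i"
done
```

Running `python3 --version` first confirms which interpreter the loop is actually querying.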
At times there is a need to merge multiple consecutive lines into one. The paste command makes this very easy. Syntax: paste -d',' - - < input_file See the example below, robin@Home-XPS:~$ echo -e '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12' 1 2 3...
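As a small illustration of the syntax above: each `-` argument tells paste to read one line from standard input, so the number of dashes sets how many consecutive lines are joined. The function names `merge_pairs` and `merge_triples` are hypothetical, chosen only for this sketch.

```shell
# Each "-" makes paste consume one line of stdin per output row.
# merge_pairs and merge_triples are hypothetical names for this sketch.
merge_pairs()   { paste -d',' - -; }
merge_triples() { paste -d',' - - -; }

# Twelve input lines joined three at a time.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 | merge_triples
```

If the input does not divide evenly, paste still emits the last row and leaves the missing fields empty.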
You will find a few similarities with Homebrew (https://brew.sh/) and Linuxbrew (http://linuxbrew.sh/). Chocolatey (https://chocolatey.org/) is a similar package manager for Windows. Installation: The installation command is a little unusual compared with other operating systems. Inside an administrative command prompt, type the following: > "%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe" -NoProfile -InputFormat None -ExecutionPolicy Bypass -Command "iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))" && SET "PATH=%PATH%;%ALLUSERSPROFILE%\chocolatey\bin"...
Top 20 files by size; Concatenate multiple files skipping n rows; Compress last X days data in one archive; Compress files older than X days individually; Merging multiple lines; Find latest files recursively; Get rows starting and ending between a...
Split program output to the console and to a file. Basic usage: $ command | tee out_file.txt Appending to the output file using -a: $ command | tee -a out_file.txt Redirecting stderr and stdout: $ command |& tee -a out_file.txt Example...
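The three tee variants above can be tried together in a short sketch. The temporary file and the echoed messages are made up for illustration, and `2>&1 |` is used as the portable spelling of bash's `|&`.

```shell
# Scratch file for the demonstration (hypothetical; any writable path works).
log_file="$(mktemp)"

# Basic usage: print to the console and overwrite the file.
echo "first run" | tee "$log_file"

# -a appends instead of overwriting.
echo "second run" | tee -a "$log_file"

# Send stderr through the pipe as well; "2>&1 |" is the portable form of "|&".
{ echo "to stdout"; echo "to stderr" >&2; } 2>&1 | tee -a "$log_file"
```

After these commands the file should hold all four lines, including the one that was written to stderr.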
Usage Basic: $ top View by process name: # top -p $(pgrep -d',' <process name>) # sample below, top -p $(pgrep -d',' http) View by user: top -u <user name> Sort by mem: # Inside top Shift+m View command...
Using Python with json.tool: $ python -m json.tool <filename.json> Or use something like line 2 below: echo '{"employee_id":1,"full_name":"Sheri Nowmer","first_name":"Sheri","last_name":"Nowmer","position_id":1,"position_title":"President","store_id":0,"department_id":1,"birth_date":"1961-08-26","hire_date":"1994-12-01 00:00:00.0","end_date":null,"salary":80000.0000,"supervisor_id":0,"education_level":"Graduate Degree","marital_status":"S","gender":"F","management_role":"Senior Management"}' > /tmp/emp.json cat /tmp/emp.json | python -m json.tool
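A shortened sketch of the same approach, assuming the interpreter is available as `python3` (the post itself calls `python`). The record is a trimmed copy of the sample above, written to a temporary file instead of /tmp/emp.json. Because json.tool exits with a nonzero status on malformed input, it can also serve as a quick validity check.

```shell
# Trimmed copy of the sample record, written to a scratch file.
json_file="$(mktemp)"
echo '{"employee_id":1,"full_name":"Sheri Nowmer","end_date":null,"salary":80000.0000}' > "$json_file"

# Pretty-print the file (assumes python3 is on PATH).
python3 -m json.tool "$json_file"

# json.tool fails on malformed JSON, so the exit status works as a validator.
if echo '{"broken":' | python3 -m json.tool >/dev/null 2>&1; then
  echo "unexpected: malformed JSON was accepted"
else
  echo "malformed JSON rejected"
fi
```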
Tip #1 Quick list of operations. Where? Example: /var/log/hadoop/hdfs cat hdfs-audit.log | awk '{cmds[$9]++}END{for (i in cmds)printf "%s %d\n",i,cmds[i]}' Results: [user@server hdfs]$ cat hdfs-audit.log | awk '{cmds[$9]++}END{for (i in cmds)printf "%s %d\n",i,cmds[i]}' cmd=setTimes 52 cmd=listStatus 47422 cmd=create 36932 cmd=getfileinfo 7431182...
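To see how the awk one-liner counts operations, the sketch below runs it against a few made-up audit lines. The sample entries, user names, and the `count_cmds` helper are all hypothetical; the only assumption carried over from the post is that the `cmd=` token is the ninth whitespace-separated field. The output is piped through sort because awk does not guarantee any order for `for (i in cmds)`.

```shell
# Hypothetical helper wrapping the awk counter from the post, with sorted output.
count_cmds() {
  awk '{cmds[$9]++} END {for (i in cmds) printf "%s %d\n", i, cmds[i]}' "$@" | sort
}

# Made-up audit lines; cmd=... sits in field 9, as in the post's example.
sample_log="$(mktemp)"
cat > "$sample_log" <<'EOF'
2020-05-14 10:00:00,001 INFO FSNamesystem.audit: allowed=true ugi=alice (auth:SIMPLE) ip=/10.0.0.1 cmd=getfileinfo src=/tmp/a dst=null perm=null
2020-05-14 10:00:00,002 INFO FSNamesystem.audit: allowed=true ugi=alice (auth:SIMPLE) ip=/10.0.0.1 cmd=create src=/tmp/b dst=null perm=null
2020-05-14 10:00:00,003 INFO FSNamesystem.audit: allowed=true ugi=bob (auth:SIMPLE) ip=/10.0.0.2 cmd=getfileinfo src=/tmp/b dst=null perm=null
2020-05-14 10:00:00,004 INFO FSNamesystem.audit: allowed=true ugi=bob (auth:SIMPLE) ip=/10.0.0.2 cmd=listStatus src=/tmp dst=null perm=null
EOF

count_cmds "$sample_log"
```

Passing the file name directly to awk, as `count_cmds` does, avoids the extra `cat` process on large audit logs.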
Well, the post name is a misnomer. Lately I've been using the ldapsearch command to test the connection to an LDAP server, hence the name. The shell command follows: $ ldapsearch -x -LLL -h ldapserver.example.com -D username -w password -b "DC=example,DC=com" -s...