How to make vertica backup

In some cases you don`t need 3 node vertica cluster and Ksafety.
We use vertica as very fast column based database + etl and database size only 50 Gb. So we can easy restore vertica from backup and use etl log processing to get actual data.

simple backup commands.

1. Create config file

/opt/vertica/bin/vbr --setupconfig

2. initialize backup storage

$ /opt/vertica/bin/vbr  --task init --config-file /home/dbadmin/leadada_snapshot.ini
Initializing backup locations.
Backup locations initialized.

3. And finally!!

$ vbr.py --task backup --config-file /home/dbadmin/leadada_snapshot.ini
Starting backup of database leadada.
Participating nodes: v_leadada_node0001.
Snapshotting database.
Snapshot complete.
Approximate bytes to copy: 37754604170 of 37754604170 total.
[==================================================] 100%
Copying backup metadata.
Finalizing backup.
Backup complete!

How to get process memory consumption list linux

Pretty easy
for resident memory consumption

ps -e -orss=,args= | sort -b -k1,1n

for virtual memory consumption

ps -e -ovsz=,args= | sort -b -k1,1n

Linux sort is great!
-k1,1n
means sort by 1st column in numeric order

Accordin to official manual:

`--key=POS1[,POS2]'
     Specify a sort field that consists of the part of the line between
     POS1 and POS2 (or the end of the line, if POS2 is omitted),
     _inclusive_.

     Each POS has the form `F[.C][OPTS]', where F is the number of the
     field to use, and C is the number of the first character from the
     beginning of the field.  Fields and character positions are
     numbered starting with 1; a character position of zero in POS2
     indicates the field's last character.  If `.C' is omitted from
     POS1, it defaults to 1 (the beginning of the field); if omitted
     from POS2, it defaults to 0 (the end of the field).  OPTS are
     ordering options, allowing individual keys to be sorted according
     to different rules; see below for details.  Keys can span multiple
     fields.

     Example:  To sort on the second field, use `--key=2,2' (`-k 2,2').
     See below for more notes on keys and more examples.  See also the
     `--debug' option to help determine the part of the line being used
     in the sort.

How to send passive checks to nagios real life example:

First of all – why you need to use passive checks in nagios.
It`s useful for large systems, nagios will not wait for connect timeout during telecom issues.
And it`s easy to configure.

Our case (large social network).
Need to check number of unsubscribers. If no “unsubscribe” letters for 1 hour – something goes wrong… FBL list not working and we need Alert. If we will not process FBL letters for several hours, email providers rise our SPAM rating.

How to fetch letters (I use ruby Imap) – topic for another article :).

1. Nagios Check code:

# cat /home/scripts/fbl.sh
#!/bin/bash

NUM=`/usr/bin/psql -t -h 1.1.1.1 -p 5450 -U cron_user  base3 -c "select count(1) from email_stop_list where (esl_created BETWEEN current_timestamp - interval '1 hour' and current_timestamp) and esl_reason ~ '^fbl'"`

if [ $NUM -eq 0 ]; then
        echo -e "nest\tunsubscribe_fbl\t3\tNo_Unsubscribe"  | /home/scripts/send_nsca -H 2.2.2.2 -p 5667 -c /etc/send_nsca.conf
 else
    echo -e "nest\tunsubscribe_fbl\t0\t$NUM unsubscribes last houer"  | /home/scripts/send_nsca -H 2.2.2.2 -p 5667 -c /etc/send_nsca.conf
 fi

2. Code for send_nsca

Plugin Return Code Service State Host State
0 OK UP
1 WARNING UP or DOWN/UNREACHABLE*
2 CRITICAL DOWN/UNREACHABLE
3 UNKNOWN DOWN/UNREACHABLE

3. Nginx service config

# cat nest.cfg
define service{
  use                            generic-service-template-passive
  host_name                       nest
  service_description             unsubscribe_fbl
  freshness_threshold             3600
  check_command                   volatile_no_information
  contact_groups                  nagios-wheel,nagios-wheel-smsmail
}

4. Service template

define service {
    use                             generic-service-template
    name                            generic-service-template-passive
    active_checks_enabled           0
    passive_checks_enabled          1
    obsess_over_service             0
    flap_detection_enabled          0
    event_handler_enabled           1
    failure_prediction_enabled      1
    is_volatile                     1
    register                        0
    check_period                    24x7
    max_check_attempts              1
    normal_check_interval           5
    retry_check_interval            2
    check_freshness                 1
    freshness_threshold             90000
    contact_groups                  nagios-wheel
    check_command                   volatile_no_information
    notifications_enabled           1
    notification_interval           15
    notification_period             24x7
    notification_options            w,u,c,r
    process_perf_data               1
    retain_status_information       1
    retain_nonstatus_information    1
}

How to tar.gz yesterday logs (some etl magic)

Task: need to tar yesteday logs in one file and gzip it.
Little bash code, just to save my time in future.

#!/bin/bash

src='/var/spool/etl/archive'

dt=`date --date="1 day ago" +"%Y-%m-%d"`
#create empty tar archive
tar cvf $src/$dt.tar --files-from /dev/null

for i in `ls -1 $src/*$dt* | grep -v gz | grep -v tar`; do
  tar -rf $src/$dt.tar $i
  rm -f $i
done
gzip $src/$dt.tar

video sound track merger

Few “years” ago I made part of small promo project for Nestle Russia as subcontractor.
It was promo action, website with some videos. Kids make sound track and my task was to merge user sound track and original video soundtrack.
It`s really easy to do with ffmpeg or mencoder.

BTW: ffmeg much better it works o.k. with aac codec and mp4 container.

Code is VERY VERY dirty, we had absolutely no time, but it can be useful to someone. And I save it “just to remember”.

#!/bin/bash

( #start subprocess
  # Wait for lock on /var/lock/.merger-1.lock (fd 200) for 10 seconds
  flock -x -w 3 200
  if [ "$?" != "0" ]; then echo Cannot lock!; exit 1; fi
  echo $$>>/var/lock/.merger-1.lock #for backward lockdir compatibility, notice this command is executed AFTER command bottom  ) 200>/var/lock/.myscript-1.exclusivelock.


sourcevideo="/var/www/kinder_prod/sourcevideo"
sourceaudio="/var/www/kinder_prod/audioupload"
targetdir="/var/www/kinder_prod/processedvideo"
processedaudio="/var/www/kinder_prod/processedaudio"

while true; do

if [ "$(ls -A $sourceaudio)" ]; then

  for i in `ls -1 $sourceaudio/*.wav | xargs -n1 basename`; do
  videoid=`echo $i | awk -F"--" '{print $1}'`
  audioid=`echo $i | awk -F"--" '{print $2}' | awk -F"." '{print $1}'`

  sox $sourceaudio/$i /tmp1/$i rate 44100; mv /tmp1/$i $sourceaudio/$i; chown milkslice:milkslice $sourceaudio/$i || exit 1

  sox -m $sourcevideo/$videoid.mp3 $sourceaudio/$i /tmp1/$videoid--$audioid.mp3 && \
  ffmpeg -y -i /tmp1/$videoid--$audioid.mp3 -strict experimental -acodec aac -bsf:a aac_adtstoasc /tmp1/$videoid--$audioid.aac && \
    ffmpeg -y -i /tmp1/$videoid--$audioid.aac -i $sourcevideo/$videoid.mp4 -bsf:a aac_adtstoasc -preset ultrafast -c copy $targetdir/$videoid--$audioid.mp4 || exit 1
#   mencoder -of lavf -lavfopts format=mp4 -oac copy  -fafmttag 0x706D  \
#-audiofile /tmp1/$videoid--$audioid.aac  -ovc copy $sourcevideo/$videoid.mp4 -o $targetdir/$videoid--$audioid.mp4 || exit 1
    chown milkslice:milkslice $targetdir/$videoid--$audioid.mp4
    mv -f $sourceaudio/$i $processedaudio
    rm /tmp1/$videoid--$audioid.mp3
        rm /tmp1/$videoid--$audioid.aac

    done

fi

sleep 1;
done

) 200>/var/lock/.merger-1.lock   #exit subprocess

FLOCKEXIT=$?  #save exitcode status

exit $FLOCKEXIT

And run screen with script. (alternative to upstart)

/usr/bin/screen -dm bash -c 'cd /root/merger-prod; /root/merger-prod/merger-prod.sh'

How to build dpkg from pecl

We need new mongo driver.
pecl install lastest
is not good solution, leads to chaos in system.

get the desired mongo extension tgz
http://pecl.php.net/package/mongo

aptitude install  dh-make-php php5-dev build-essential debhelper

wget http://pecl.php.net/get/mongo-1.6.11.tgz
OR pecl download mongo
dh-make-pecl --phpversion 5 --prefix php5- mongo-1.6.11.tgz
./debian/rules binary

Great Thanks to author: https://www.dotdeb.org/2008/09/25/how-to-package-php-extensions-by-yourself/
He saved my day.

How to delete files without big iowait

I know 2 ways, tested in high loaded production.

if scheduler support ionice (on some systems makes LA)

 # ionice -c 3 nice -n 20 find  /DIRECTORY -type f -delete

Just ajust sleep time, according to your system LA

while true; do find /DIRECTORY/ -type f -print  -delete -quit; sleep 0.01; done

mysql 5.6 GTID global transaction identifier

Wow! It`s a really nice feature. Now you can do very easy replication.
i.e. In pre 5.6 you should create replica like this:

1. Turn on binary logs at master

 vi /etc/mysql/my.cnf
 server-id              = 11
 log_bin                 = /var/log/mysql/mysql-bin.log
 # WARNING: Using expire_logs_days without bin_log crashes the server! See README.Debian!
 expire_logs_days        = 10
 max_binlog_size         = 100M
 binlog_do_db            = mydatabase
 #binlog_ignore_db       = include_database_name
 binlog-format=ROW     #I MIXED and STATEMENT sometimes not good
binlog-checksum=crc32  # 5.6 feature speed up binlog
gtid-mode=on           #Use force, Luke

2. Create replication User

 grant replication slave on *.* to 'repl_user'@'%' identified by 'SecurePassword';

3. Dump all databases

mysqldump --master-data=2 --single-transaction --events --routines --triggers --all-databases  > database.sql

4. On slave after dump restore

 CHANGE MASTER TO MASTER_HOST='masterHost', MASTER_USER='repl_user',MASTER_LOG_FILE=, MASTER_LOG_POS=,
 MASTER_PASSWORD='SecurePassword';
 START SLAVE;
 show slave status;

But at 5.6 On slave

change master to MASTER_HOST='masterHost", MASTER_AUTO_POSITION=1, MASTER_USER=’repl_user’, MASTER_PASSWORD=’SecurePassword';
START SLAVE;
show slave status;

P.S. If you need to skip one request on slave:

SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE;

nginx proxy_pass and cache regexp location.

nginx cannot proxy_pass at regexp location. I made this workaround.
Works great! Now I can cache any static data provided by backend. 🙂 from any location!

location ~* \.(gif|jpg|png|ico)$ {
      rewrite ^.(gif|jpg|png|ico) /$1 break;
      proxy_pass         http://127.0.0.1:8080;
      proxy_redirect     off;
      proxy_set_header    Host             $host;
      proxy_set_header    X-Real-IP        $remote_addr;

      proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
      client_max_body_size       150m;
      client_body_buffer_size    128k;
      proxy_connect_timeout      90;
      proxy_send_timeout         90;
      proxy_read_timeout         90;
      proxy_buffer_size          4k;
      proxy_buffers              4 32k;
      proxy_busy_buffers_size    64k;
      proxy_temp_file_write_size 64k;

      proxy_cache cache_common;
      proxy_cache_key "$host|$request_uri";
      proxy_cache_valid 200 302 301 15m;
      proxy_cache_valid 404         10s;
      proxy_cache_valid any          1m;
        }

kill process running longer than (bash)….

At one our advertising server located far beyond in another galaxy 🙂 🙂 rsync over ssh begin to hung sometimes without any reason.
of course, we will try strace and other debug tools but tomorrow. Today we need quick fix sollution.

BTW
removing compression not help, and –timeout option rather not help solve this case
my rsync command:

 rsync --timeout=30 -apvre ssh -o StrictHostKeyChecking=no
       --remove-source-files /opt/logrsync/workdir/clicks/etl/20150707215235-SERVERNAME.pb.gz
       etl@etl2-1.SERVERNAME.it.randomthemes.com:~/etl/data/sources/clicks/

How to find rsync processes running more then 10 seconds, and kill them (bash):

while true; do
      for i in `ps -C rsync -o pid=,etimes= | awk '{if ($2 > 10) print $1}'`; do
       echo $i; kill $i; sleep 10;
      done
done

if somebody solve the same rsync issue – please, please tell me how!

In our case everything start working o.k. without any action, It was connectivity issue, but very very strange.
–timeout should help.