My 2nd Head

Showing posts from 2011

Splunking Oracle's ZFS Appliance Part II

October 12, 2011

In the first part I wrote about storing long-term analytics data in Splunk. Wouldn't it be nice to also track storage capacity with Splunk? This is how it's done:

1. Get pool properties

    #!/bin/ksh
    # List the capacity of all pools in a system to ${outputdir}/${ipname}.pools.log
    # Example: listPools.ksh /tmp 10.16.5.14
    typeset outputdir=$1
    typeset ipname=$2
    typeset debug=$3
    typeset user=monitor
    if [ -z "$1" -o -z "$2" ]; then
        printf "\nUsage: $0 <output Dir> <ZFSSA ipname> [ debug ]\n\n"
        exit 1
    fi
    mkdir -p ${outputdir}
    dat=$(date +'%y-%m-%d %H:%M:%S')
    ssh -T ${user}@${ipname} << --EOF-- > ${outputdir}/${ipname}.pools.log
    script
    run('status');
    run('storage');
    var poollist=list();
    printf("Time,pool,avail,compression,used,space_percentage\\n");
    for(var k=0; k<poollist.length; k++) {
    run('select ' + poollist[k]);
    var space_used=get(...
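Once such a log exists, a quick check outside Splunk can be handy. The following is only a sketch: `pools_over` is a hypothetical helper, and it assumes the column order of the header line printed by the script (`Time,pool,avail,compression,used,space_percentage`) and that the last column is a plain number, optionally followed by `%`. The rest of the original script is not shown here, so the actual value format may differ.

```shell
#!/bin/sh
# Print "pool,space_percentage" for every pool above a threshold.
# Assumed input: CSV with the header
#   Time,pool,avail,compression,used,space_percentage
# pools_over is a hypothetical helper, not part of the original script.
pools_over() {
    file=$1
    limit=${2:-80}                      # default threshold: 80 percent
    awk -F, -v limit="$limit" '
        NR == 1 { next }                # skip the header line
        ($6 + 0) > (limit + 0) { print $2 "," $6 }
    ' "$file"
}
```

For example, `pools_over /tmp/10.16.5.14.pools.log 90` would list the pools that are more than 90 percent full.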

Splunking Oracle's ZFS Appliance

October 07, 2011

We have a bunch of Oracle ZFS Appliances. What I really like is their integrated DTrace-based analytics feature. However, some things are missing or cause problems:

- Storing long-term analytics data on the appliances produces a lot of data on the internal disks. This can fill up your appliance and, in the worst case, slow down the appliance software.
- Scaling the timeline out too far makes peaks invisible. This is probably a limitation of the rendering software used on the appliance (JavaScript).
- Comparing all our appliances is not possible, because there is no central analytics console.

As we are heavy Splunk users, I sat down with our friendly storage consultant from Oracle and we brought these two great products closer together. This is how we did it:

1. Setting up analytics worksheets

First we had to create the analytics worksheets. This is best done using the CLI, as the order of drilldowns should always be the same. Otherwise fields in the gener...

Splunk: Unscaling units

May 22, 2011

I'm working on a Splunk application for Solaris. One of the commands that is of interest to me is fsstat(1m). Here's the output for two filesystem types (zfs, nfs4):

    solaris# fsstat zfs nfs4 1 1
      new  name  name  attr  attr lookup rddir  read  read write write
     file remov  chng   get   set    ops   ops   ops bytes   ops bytes
    2.21K   881   521  585K 1.22K  1.71M 9.34K 1.66M 21.3G  765K 10.7G zfs
        0     0     0     0     0      0     0     0     0     0     0 nfs4
        0     0     0    20     0 ...
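To chart values like `2.21K` or `21.3G`, they first have to be turned back into plain numbers. The sketch below shows one way to do that in the shell before the data reaches Splunk. It is not the approach from the full post, `unscale` is a hypothetical helper, and it assumes the suffixes are powers of 1024; pass 1000 as the second argument if your fsstat scales decimally.

```shell
#!/bin/sh
# Convert a scaled value such as 2.21K or 21.3G into a plain number.
# Assumption: K, M, G, ... are powers of 1024 unless a base is given as $2.
# unscale is a hypothetical helper name.
unscale() {
    printf '%s\n' "$1" | awk -v base="${2:-1024}" '
    {
        n = $1
        suffix = substr(n, length(n), 1)
        p = index("KMGTPE", suffix)     # 0 when the value has no suffix
        if (p > 0)
            n = substr(n, 1, length(n) - 1)
        printf "%.0f\n", n * (base ^ p)
    }'
}
```

With the default base, `unscale 2.21K` prints `2263`, and `unscale 881` prints `881` unchanged.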

Adjusting ZFS resilvering speed

March 13, 2011

There are two kernel parameters that can be adjusted if ZFS resilvering is too slow or too fast:

    zfs_resilver_delay        /* number of ticks to delay resilver */
    zfs_resilver_min_time_ms  /* min millisecs to resilver per txg */

In some cases the values can be too low or too high (e.g. when using mirroring vs. RAIDZ). A boost could be:

    # echo zfs_resilver_delay/W0|mdb -kw
    # echo zfs_resilver_min_time_ms/W0t3000|mdb -kw

whereas a handbrake is e.g.:

    # echo zfs_resilver_delay/W2|mdb -kw
    # echo zfs_resilver_min_time_ms/W0t300|mdb -kw

Disclaimer: Use at your own risk. Do not try on production systems without contacting support first.
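Before changing either value it is worth noting the current setting so it can be restored later. The sketch below only prints the mdb commands instead of running them, so they can be reviewed first. `resilver_show_cmd` and `resilver_set_cmd` are hypothetical helper names. The sketch assumes both tunables are 4-byte integers, which is what the `/W` writes above imply, and it always uses the `0t` prefix because mdb otherwise reads numbers as hexadecimal.

```shell
#!/bin/sh
# Print (do not run) mdb commands for the two resilver tunables.
# Both helper names are hypothetical; nothing here touches the kernel.

# Command that shows the current value as a decimal integer.
resilver_show_cmd() {
    printf 'echo %s/D | mdb -k\n' "$1"
}

# Command that writes a new decimal value (0t prefix = decimal in mdb).
resilver_set_cmd() {
    var=$1
    val=$2
    case $var in
        zfs_resilver_delay|zfs_resilver_min_time_ms) ;;
        *) echo "unknown tunable: $var" >&2; return 1 ;;
    esac
    case $val in
        ''|*[!0-9]*) echo "value must be a non-negative integer" >&2; return 1 ;;
    esac
    printf 'echo %s/W0t%s | mdb -kw\n' "$var" "$val"
}
```

For example, `resilver_set_cmd zfs_resilver_min_time_ms 3000` prints the second boost command shown above, with spaces around the pipe.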

Useful Dtrace One-Liners....

March 01, 2011

In the following one-liners, replace execname inside the double quotes with the name of the process you are interested in.

Finding write operations for a process, especially when writing to an NFS share:

    # dtrace -n 'fsinfo:::write /execname == "execname"/ \
        { printf("%s", args[0]->fi_pathname) }'

Finding the top userland stacks for a process:

    # dtrace -n 'syscall:::entry /execname == "execname"/ \
        { @[ustack()] = count();}'

Finding the same for a certain system call:

    # dtrace -n 'syscall::mmap:entry /execname == "execname"/ \
        { @[ustack()] = count();}'
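Typing the quoting by hand is error-prone, so here is a small sketch that fills a process name into these one-liners and prints the resulting command on a single line. The helper names `dtrace_write_cmd` and `dtrace_ustack_cmd` are made up for illustration, and nothing is executed.

```shell
#!/bin/sh
# Print the one-liners with a concrete process name filled in.
# Helper names are hypothetical; the commands are printed, not run.
q="'"   # a literal single quote, kept in a variable to simplify quoting

# Write operations of a process, with the file path of each write.
dtrace_write_cmd() {
    printf 'dtrace -n %sfsinfo:::write /execname == "%s"/ { printf("%%s", args[0]->fi_pathname) }%s\n' \
        "$q" "$1" "$q"
}

# Top userland stacks; $2 optionally narrows it to one system call.
dtrace_ustack_cmd() {
    printf 'dtrace -n %ssyscall::%s:entry /execname == "%s"/ { @[ustack()] = count(); }%s\n' \
        "$q" "${2:-}" "$1" "$q"
}
```

Calling `dtrace_ustack_cmd tar mmap` prints the third one-liner for a process named tar; leaving out the second argument gives the `syscall:::entry` form.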

New ZFS Appliances...

March 01, 2011

Measuring Read/Write durations with DTrace

January 21, 2011

I had a situation where I wanted to see whether read/write operations take too much time. I thought this could be done easily with DTrace. Unfortunately, my DTrace skills are a bit rusty, so I contacted my personal DTrace guru Javier, who gave me a script. Here is the script for read operations:

slow_read.d:

    #!/usr/sbin/dtrace -s
    #pragma D option quiet
    #pragma D option switchrate=10hz

    syscall::*read:entry
    {
        this->filistp = curthread->t_procp->p_user.u_finfo.fi_list;
        this->ufentryp = (uf_entry_t *)((uint64_t)this->filistp +
            (uint64_t)arg0 * (uint64_t)sizeof (uf_entry_t));
        this->filep = this->ufentryp->uf_file;
        self->offset = this->filep->f_offset;
        this->vnodep = this->filep != 0 ? this->filep->f_vnode : 0;
        ...