Solaris Zone Workload Plug-in Monitor
The Solaris Zone Workload Monitor is now available on the Grid
Started by David Leith, Jan 10 2008 01:41 PM
4 replies to this topic
#1
Posted 10 January 2008 - 01:41 PM
#2
Posted 16 March 2011 - 06:55 PM
OK, I just got this set up on my Solaris uptime server.
There were a couple of issues, so I thought I would share them with everyone.
First, when I tried to test the service monitor, it failed with an error that the tempfile could not be created.
The fix was to create the directory "/opt/uptime/tmp" on the uptime server.
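For reference, creating that directory can be sketched as below. The helper name ensure_tmp_dir is hypothetical, and the right owner and permissions depend on which user the monitoring station runs as, which the post does not say.

```shell
# Hypothetical helper: make sure the monitor's temp directory exists and is writable.
# The default path is the one named in the post; ownership and mode are left to the admin.
ensure_tmp_dir() {
    dir="${1:-/opt/uptime/tmp}"
    # Create the directory (and any missing parents) only if it is not there yet.
    [ -d "$dir" ] || mkdir -p "$dir" || return 1
    # Succeed only if the current user can actually write temp files into it.
    [ -w "$dir" ]
}
```

Run as the user that executes the service monitor, `ensure_tmp_dir` with no argument would check the default path, and a non-zero return means the tempfile error is likely to persist.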
The next issue was a little harder. Some of my agents were showing the data properly, others were not showing the memory, and some were only showing a count of the zones.
These turned out to be two separate problems.
The first looks like something the original script tried to handle, but not completely. Uptime requires the results for persistent data to be numeric only. Because the output of prstat -Z on some of my servers reports memory in gigabytes, the data passed back would have a "G" in it. I also noticed that if a value contained an "M", the suffix was simply stripped off, so those results were off by a factor of 1024.
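The unit problem described above can be sketched as a small helper that turns every size into plain kilobytes. This is only an illustration: the name to_kb is hypothetical, it assumes a single trailing K, M or G suffix, and it calls plain awk rather than the /usr/bin/nawk path used in the script.

```shell
# to_kb: convert a prstat-style size (e.g. 14G, 512M, 900K) to a bare number of kilobytes.
# Assumption: the unit is one trailing letter; anything else is passed through unchanged.
to_kb() {
    echo "$1" | awk '{
        v = substr($1, 1, length($1) - 1) + 0   # numeric part
        u = substr($1, length($1), 1)           # unit suffix
        if (u == "G")      printf "%d\n", v * 1024 * 1024
        else if (u == "M") printf "%d\n", v * 1024
        else if (u == "K") printf "%d\n", v
        else               print $1
    }'
}

to_kb 14G     # gigabytes scaled by 1024*1024
to_kb 512M    # megabytes scaled by 1024
```

Reporting everything in one unit is what keeps zones comparable: simply dropping the suffix would make a 14G zone look smaller than a 512M one.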
The second issue was that the script was designed to start prstat -Z, store the output, and then kill prstat after 2 seconds. Some of my servers are busy enough that prstat would not return any output within the 2-second timeout. I changed the script to run prstat -Z for a single iteration and use that output.
Here is the script with my changes applied.
CODE
#!/bin/ksh
#set -x
# a shell script to display basic workload information on the various zones on this system
# based off of prstat -Z and zoneadm
# ideal output is like so
# zones_running 5
# css1\.cpu 5
# css1\.mem 1000
# css1\.rss 2000
# css2\.cpu 25
# css2\.mem 2000
# css2\.rss 8000
# css3\.cpu 40
# css3\.mem 3000
# css3\.rss 7500
AWKBIN="/usr/bin/nawk"
SEDBIN="/usr/bin/sed"
PRSTAT="/usr/bin/prstat -Z 1 1"
ZONEADM="/usr/sbin/zoneadm list -iv"
# first add up the running zones
ZR=`$ZONEADM | grep running | wc -l`
echo "zones_running $ZR"
# Original design used two temp files and would timeout if the server was busy
# changed to not rely on tempfiles and not timeout
$PRSTAT | grep -v 'ZONEID' | grep -v 'PID' | $AWKBIN '{if (NF==8) print $0;}' | while read ZONEID NPROC SIZE RSS MEMORY TIME CPU ZONE; do
    # strip the % sign so only the number is reported
    CPU=`echo $CPU | $SEDBIN 's/%//'`
    # normalize SIZE and RSS to kilobytes: G -> *1024*1024, M -> *1024, K -> suffix dropped
    SIZE=`echo $SIZE | $AWKBIN '{ V=substr($1,1,length($1)-1); M=substr($1,length($1),1); if (M=="G") { print V*1024*1024 } else if (M=="M") { print V*1024 } else if (M=="K") { print V+0 } else { print $1 } }'`
    RSS=`echo $RSS | $AWKBIN '{ V=substr($1,1,length($1)-1); M=substr($1,length($1),1); if (M=="G") { print V*1024*1024 } else if (M=="M") { print V*1024 } else if (M=="K") { print V+0 } else { print $1 } }'`
    echo ${ZONE}.cpu $CPU
    echo ${ZONE}.mem $SIZE
    echo ${ZONE}.rss $RSS
    echo ${ZONE}.procs $NPROC
done
Hope others find this useful..
#3
Posted 22 February 2012 - 02:42 PM
Hi,
I'm having some fun with this plugin on Uptime v6. I like your mods from the original and I'd like to suggest some adjustments.
If we change the command to PRSTAT="prstat -Z -n1,20 1 1", we get a slightly more useful set of output, especially when there are lots of zones: -n1,20 limits the report to 1 process line and up to 20 zone lines.
My version of the script now looks like this in terms of the grunt work part:
PRSTAT="prstat -Z -n1,20 1 1"
ZONEADM="/usr/sbin/zoneadm list -iv"
# first add up the running zones
ZR=`$ZONEADM | grep running | wc -l`
echo "zones_running $ZR"
# now the tricky part
$PRSTAT | grep -v 'ZONEID' | grep -v 'PID' | grep -v "\/" | grep -v Total | while read ZONEID NPROC SIZE RSS MEMORY TIME CPU ZONE; do
    # Dump the last character so we just have numbers and no M characters or % signs.
    CPU=`echo $CPU |sed 's/\(.*\)./\1/'`
    SIZE=`echo $SIZE |sed 's/\(.*\)./\1/'`
    RSS=`echo $RSS |sed 's/\(.*\)./\1/'`
    MEM=`echo $MEMORY |sed 's/\(.*\)./\1/'`
    echo ${ZONE}.cpu $CPU
    echo ${ZONE}.mem $SIZE
    echo ${ZONE}.rss $RSS
    echo ${ZONE}.memory $MEM
    echo ${ZONE}.procs $NPROC
done
I freely admit there is probably a more efficient way of doing this but it produces the numbers in a useful format and can deal with up to 20 zones on a box.
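On the efficiency point, one possible tightening is to replace the chain of grep -v filters with a single awk test that keeps only the per-zone summary lines. The sketch below is an assumption-laden illustration: zone_lines and sample_prstat are hypothetical names, the sample text is made up to resemble prstat -Z output, and the test relies on zone lines having exactly 8 fields with a numeric zone ID first.

```shell
# zone_lines: keep only lines that look like per-zone summaries
# (8 whitespace-separated fields, first field a numeric zone ID).
zone_lines() {
    awk 'NF == 8 && $1 ~ /^[0-9]+$/'
}

# Made-up sample resembling `prstat -Z -n1,20 1 1` output, for illustration only.
sample_prstat() {
    cat <<'EOF'
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
  1234 root       10M 8000K sleep   59    0   0:00:01 0.1% sshd/1
ZONEID    NPROC  SWAP   RSS MEMORY      TIME  CPU ZONE
     0       50  500M  300M   3.0%   1:00:00 1.0% global
     1      248   14G   13G   4.9%   5:00:00 2.0% myzone1
Total: 298 processes, 900 lwps, load averages: 0.10, 0.20, 0.30
EOF
}

# Only the two zone rows (global and myzone1) pass the filter.
sample_prstat | zone_lines
```

On a real system the filter would sit between $PRSTAT and the while read loop, so the headers, the process line and the Total line never reach the loop.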
Cheers
Joe
QUOTE (Towster @ Mar 16 2011, 06:55 PM)
Ok, I just got this set up on my Solaris uptime server. There were a couple of issues, so I thought I would share them with everyone. [...]
#4
Posted 06 June 2012 - 07:07 AM
Hi,
I like your modifications, but I've had some issues with the output for zones that are using gigabytes of memory.
For example, for a zone named myzone1 that is using 14GB of memory with 13GB of RSS, the output looks like this:
myzone1.cpu 2.0
myzone1.mem 14
myzone1.rss 13
myzone1.memory 4.9
myzone1.procs 248
So I've changed the way the memory and RSS outputs are handled.
Below is the core part of the modified script:
PRSTAT="prstat -Z -n1,20 1 1"
ZONEADM="/usr/sbin/zoneadm list -iv"
# first add up the running zones
ZR=`$ZONEADM | grep running | wc -l`
echo "zones_running $ZR"
# now the tricky part
$PRSTAT | grep -v 'ZONEID' | grep -v 'PID' | grep -v "\/" | grep -v Total | while read ZONEID NPROC SIZE RSS MEMORY TIME CPU ZONE; do
    # Dump the last character so we just have numbers and no M characters or % signs.
    CPU=`echo $CPU |sed 's/\(.*\)./\1/'`
    # If memory size is in GB, dump the last character and then convert to MB
    if echo $SIZE | grep G > /dev/null 2>&1
    then
        SIZE=`echo $SIZE |sed 's/\(.*\)./\1/'`
        SIZE=$(($SIZE * 1024))
    else
        SIZE=`echo $SIZE |sed 's/\(.*\)./\1/'`
    fi
    # If RSS size is in GB, dump the last character and then convert to MB
    if echo $RSS | grep G > /dev/null 2>&1
    then
        RSS=`echo $RSS |sed 's/\(.*\)./\1/'`
        RSS=$(($RSS * 1024))
    else
        RSS=`echo $RSS |sed 's/\(.*\)./\1/'`
    fi
    MEM=`echo $MEMORY |sed 's/\(.*\)./\1/'`
    echo ${ZONE}.cpu $CPU
    echo ${ZONE}.mem $SIZE
    echo ${ZONE}.rss $RSS
    echo ${ZONE}.memory $MEM
    echo ${ZONE}.procs $NPROC
done
exit 0
With this version, I now get the following output for the same zone:
myzone1.cpu 2.0
myzone1.mem 14336
myzone1.rss 13312
myzone1.memory 4.9
myzone1.procs 248
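As a cross-check of the figures above, here is a minimal sketch that feeds one made-up zone line through the same kind of conversion and reports sizes in megabytes. The names to_mb and emit_zone_metrics are hypothetical, the zone ID and TIME values in the sample line are invented, and doing the math in awk (instead of shell integer arithmetic) is a swapped-in choice so that a fractional value would not break the calculation.

```shell
# to_mb: convert 14G / 512M / 2048K to a bare number of megabytes.
# Assumption: a single trailing K, M or G; other values pass through unchanged.
to_mb() {
    echo "$1" | awk '{
        v = substr($1, 1, length($1) - 1) + 0
        u = substr($1, length($1), 1)
        if (u == "G")      printf "%d\n", v * 1024
        else if (u == "M") printf "%d\n", v
        else if (u == "K") printf "%d\n", v / 1024
        else               print $1
    }'
}

# emit_zone_metrics: read zone summary lines on stdin, print name/value pairs.
emit_zone_metrics() {
    while read ZONEID NPROC SIZE RSS MEMORY TIME CPU ZONE; do
        echo "${ZONE}.cpu ${CPU%\%}"          # drop the trailing % sign
        echo "${ZONE}.mem $(to_mb "$SIZE")"
        echo "${ZONE}.rss $(to_mb "$RSS")"
        echo "${ZONE}.memory ${MEMORY%\%}"
        echo "${ZONE}.procs $NPROC"
    done
}

# Made-up input line: 14G becomes 14 * 1024 = 14336, 13G becomes 13 * 1024 = 13312.
echo "1 248 14G 13G 4.9% 5:00:00 2.0% myzone1" | emit_zone_metrics
```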
Hope others find this version useful
#5
Posted 06 June 2012 - 06:52 PM
Hey guys;
Just to let you know, we've made some updates to the Solaris Zone Workload monitor and the agent-side script by default is the one above from Steve Esso. We also made a bunch of updates to all the files, changed the monitoring station scripts to use PHP instead of Perl (so Perl is not a requirement anymore), and made some important updates to the monitor definition (XML).
You can find it here:
http://support.uptim...w.php?mod_id=24
Joel Pereira
Solutions Architect
uptime software ...because downtime is not an option