From bzajac at geostaff.com  Tue Oct  5 15:13:57 1999
From: bzajac at geostaff.com (Blair Zajac)
Date: Tue, 05 Oct 1999 15:13:57 -0700
Subject: [Orca-announce] New orcallator.se - 1.15
Message-ID: <37FA7825.A38937A5@geostaff.com>

Hello,

Here's a new release of orcallator.se.  It includes the following changes
since my last release (1.13).

// Version 1.15: Oct 5, 1999   kvm$mpid is an int, not a long.  This
//                             caused problems on Solaris 7 hosts running
//                             a 64 bit kernel.
// Version 1.14: Oct 1, 1999   Rename disk.c?t?d? column names to
//                             disk_runp.c?t?d? to better reflect the data
//                             being recorded and to allow for more per
//                             disk information later.

I have also placed it on my website at

http://www.geocities.com/~bzking/

Blair

-------------- next part --------------
//
// Orcallator.se, a log-generating performance monitor.
//
// This program logs many different system quantities to a log file
// for later processing.
//
// Author: Blair Zajac .
//
// Portions copied from percollator.se written by Adrian Cockcroft.
//
// Version 1.15: Oct 5, 1999   kvm$mpid is an int, not a long.  This
//                             caused problems on Solaris 7 hosts running
//                             a 64 bit kernel.
// Version 1.14: Oct 1, 1999   Rename disk.c?t?d? column names to
//                             disk_runp.c?t?d? to better reflect the data
//                             being recorded and to allow for more per
//                             disk information later.
// Version 1.13: Sep 24, 1999  Fix a bug in the disk_mean calculation
//                             where it was being divided by the wrong
//                             disk_count.  Now it should be much larger
//                             and in scale with disk_peak.  When
//                             WATCH_DISK is defined, now print each
//                             disk's run percent.  Add a new define
//                             WATCH_MOUNTS, which reports each mount
//                             point's disk space and inode capacity,
//                             usage, available for non-root users and
//                             percent used.  This comes from Duncan
//                             Lawie tyger at hoopoes.com.  Add some
//                             smarts so that if the number of
//                             interfaces, physical disks, or mounted
//                             partitions changes, then a new header is
//                             printed.  This will prevent column name
//                             and data mixups when the system
//                             configuration changes.
// Version 1.12: Sep 14, 1999  Add the page scan rate as scanrate in
//                             measure_cpu.
// Version 1.11: Aug 13, 1999  Add the number of CPUs as ncpus.  Move
//                             measure_disk and measure_ram sooner in the
//                             list of subsystems to handle.  Increase
//                             the number of characters for each network
//                             interface from four to five.  Add new disk
//                             reads, writes, Kbytes read, and Kbytes
//                             written per second.  Add the amount of
//                             free memory in Kbytes as freememK.
// Version 1.10: Jul 28, 1999  Measure the process spawn rate if
//                             WATCH_CPU is defined and the user is root.
// Version 1.9:  Jun 2, 1999   If WATCH_YAHOO is defined, then process
//                             the access log as a Yahoo! style access
//                             log.  Restructure the code to handle
//                             different web server access log formats.
// Version 1.8:  Jun 1, 1999   If the environmental variable WEB_SERVER
//                             is defined, use its value as the name of
//                             the process to count for the number of
//                             web servers on the system.  If WEB_SERVER
//                             is not defined, then count the number of
//                             httpd's.
// Version 1.7:  Mar 25, 1999  Simplify and speed up count_proc by 20%.
// Version 1.6:  Feb 23, 1999  Print pvm.user_time and system_time
//                             correctly.
// Version 1.5:  Feb 23, 1999  Always write header to a new file.
// Version 1.4:  Feb 19, 1999  Handle missing HTTP/1.x in access log.
// Version 1.3:  Feb 18, 1999  On busy machines httpops5 will be
//                             enlarged.
// Version 1.2:  Feb 18, 1999  Output data on integral multiples of
//                             interval.
// Version 1.1:  Feb 18, 1999  Integrate Squid log processing from SE
//                             3.1.
// Version 1.0:  Sep 9, 1998   Initial version.
//
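// A note on the 1.15 fix: under a 64 bit (LP64) Solaris 7 kernel a long
// is 8 bytes while the kernel's mpid counter is a 4 byte int, so reading
// kvm$mpid as a long pulls in 4 bytes that do not belong to the counter.
// A C-style sketch of the failure mode (illustrative values only,
// memcpy from <string.h>, little-endian host assumed for the layout):
//
//   unsigned char kmem[8] = { 0x39, 0x30, 0, 0, 0xde, 0xad, 0xbe, 0xef };
//   int  as_int;   /* 4-byte read yields 12345, the true counter value */
//   long as_long;  /* 8-byte read also swallows 4 unrelated bytes */
//
//   memcpy(&as_int,  kmem, sizeof(as_int));   /* correct */
//   memcpy(&as_long, kmem, sizeof(as_long));  /* garbage in high bytes */
//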
// The default sampling interval in seconds.
#define SAMPLE_INTERVAL 300

// The maximum number of columns of data.
#define MAX_COLUMNS 512

// Define the different parts of the system you want to examine.
#ifdef WATCH_OS
#define WATCH_CPU    1
#define WATCH_MUTEX  1
#define WATCH_NET    1
#define WATCH_TCP    1
#define WATCH_NFS    1
#define WATCH_MOUNTS 1
#define WATCH_DISK   1
#define WATCH_DNLC   1
#define WATCH_INODE  1
#define WATCH_RAM    1
#define WATCH_PAGES  1
#endif

#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include

#ifdef WATCH_MOUNTS
#include
#include
#endif

#if WATCH_CPU || WATCH_WEB
#include

#ifdef WATCH_CPU
// This is the maximum pid on Solaris hosts.
#define DEFAULT_MAXPID 30000
#include
#endif

#ifdef WATCH_WEB
#include

// Define this macro which returns the size index for a file of a
// particular size.  This saves the overhead of a function call.
#define WWW_SIZE_INDEX(size, size_index) \
  if (size < 1024) { \
    size_index=0;      /* under 1KB */ \
  } else { \
    if (size < 10240) { \
      size_index=1;    /* under 10K */ \
    } else { \
      if (size < 102400) { \
        size_index=2;  /* under 100KB */ \
      } else { \
        if (size < 1048576) { \
          size_index=3;  /* under 1MB */ \
        } else { \
          size_index=4;  /* over 1MB */ \
        } \
      } \
    } \
  } \
  dwnld_size[size_index]++;

// Handle the reply code from the server.
#define WWW_REPLY_CODE(word) \
  if (word != nil) { \
    if (word == "304") { \
      httpop_condgets++; \
    } \
    else { \
      first_byte = word; \
      if (first_byte[0] == '4' || first_byte[0] == '5') { \
        httpop_errors++; \
      } \
    } \
  }

// Handle the method of the object served.  This define only works with
// non-proxy servers.
#define WWW_METHOD1(word) \
  if (word != nil) { \
    switch (word) { \
    case "get": \
    case "GET": \
      httpop_gets++; \
      break; \
    case "post": \
    case "POST": \
      httpop_posts++; \
      break; \
    case "head": \
    case "HEAD": \
      ishead = 1; \
      httpop_condgets++; \
      break;

#ifdef WATCH_SQUID
#define WWW_METHOD2 \
    case "icp_query": \
    case "ICP_QUERY": \
      squid_icp_queries++; \
      break;
#else
#define WWW_METHOD2
#endif

#define WWW_METHOD_END \
    default: \
      break; \
    } \
  }

#define WWW_METHOD(word) WWW_METHOD1(word) WWW_METHOD2 WWW_METHOD_END
#endif
#endif

// Put all rules here so they can be accessed by the handle functions.
lr_cpu_t       lr_cpu$cpu;
lr_cpu_t       tmp_lrcpu;
lr_mutex_t     lr_mutex$m;
lr_mutex_t     tmp_mutex;
lr_net_t       lr_net$nr;
lr_net_t       tmp_nr;
lr_tcp_t       lr_tcp$tcp;
lr_tcp_t       tmp_lrtcp;
#ifdef WATCH_TCP
tcp            tcp$tcp;
tcp            tmp_tcp;
#endif
lr_rpcclient_t lr_rpcclient$r;
lr_rpcclient_t tmp_lrpcc;
lr_disk_t      lr_disk$dr;
lr_disk_t      tmp_dr;
lr_dnlc_t      lr_dnlc$dnlc;
lr_dnlc_t      tmp_lrdnlc;
lr_inode_t     lr_inode$inode;
lr_inode_t     tmp_lrinode;
lr_ram_t       lr_ram$ram;
lr_ram_t       tmp_lrram;
#ifdef WATCH_PAGES
ks_system_pages kstat$pages;
ks_system_pages tmp_kstat_pages;
#endif
lr_swapspace_t lr_swapspace$s;
lr_swapspace_t tmp_lrswap;
lr_kmem_t      lr_kmem$kmem;
lr_kmem_t      tmp_lrkmem;
ks_system_misc kstat$misc;
ks_system_misc tmp_kstat_misc;

// Put application globals here.
string nodename;                    // Name of this machine.
string program_name;                // Name of this program.
int    hz;                          // Clock tick rate.
int    page_size;                   // Page size in bytes.
long   boot_time;                   // Boot time of the system.
long   interval = SAMPLE_INTERVAL;  // Sampling interval.

#ifdef WATCH_CPU
int can_read_kernel = 0;            // If the kernel can be read.
int kvm$mpid;                       // The last created PID.

// These variables store the mpid before and after the standard interval.
int mpid_previous;
int mpid_current;
ulonglong mpid_then;
ulonglong mpid_now;

// These variables store the mpid before and after 5 second intervals.
int mpid5_previous;
int mpid5_current;
ulonglong mpid5_then;
ulonglong mpid5_now;
double mpid5_rate;
#endif
#ifdef WATCH_MOUNTS
mnttab_t mnt$mnt;
mnttab_t tmp_mnt;
#endif

// Variables for handling the httpd access log.
#ifdef WATCH_WEB
string www_server_proc_name = getenv("WEB_SERVER");
string search_url           = getenv("SEARCHURL");
string www_log_filename     = getenv("WEB_LOG");
string www_gateway          = getenv("GATEWAY");
ulong  www_fd;
uint   www_gatelen;
stat_t www_stat[1];
ulong  www_ino;
long   www_size;
double www_interval;       // Hi-res interval time.
ulonglong www_then;
ulonglong www_now;
double www5_interval;      // Actual hi-res 5 second interval.
ulonglong www5_then;
ulonglong www5_now;
double httpops;
double httpops5;
double gateops;
double dtmp;
long   httpop_gets;
long   httpop_condgets;    // HEAD or code = 304 conditional get no data.
long   httpop_posts;
long   httpop_cgi_bins;
long   httpop_searches;
long   httpop_errors;
long   dwnld_size[5];      // [0] < 1K, [1] < 10K, [2] < 100K, [3] < 1M, [4] >= 1M
long   dwnld_totalz;       // total size counted from log
#if WATCH_PROXY || WATCH_SQUID || WATCH_YAHOO
// If we're watching a Yahoo log, then take the transfer time to be the
// processing time.
double www_dwnld_time_sum;          // transfer time
double www_dwnld_time_by_size[5];   // mean transfer time by size bin
#endif
#if WATCH_PROXY || WATCH_SQUID
long prxy_squid_indirect;      // number of hits that go via PROXY,SOCKS,parent
long prxy_squid_cache_hits;    // number of hits returned from cache
#endif
#ifdef WATCH_PROXY
long prxy_cache_writes;        // number of writes and updates to cache
long prxy_uncacheable;         // number of explicitly uncacheable httpops
// any extra is errors or incomplete ops
#endif
#ifdef WATCH_SQUID
long squid_cache_misses;
long squid_icp_requests;
long squid_icp_queries;
long squid_client_http;
#endif
#endif

// Variables for handling output.
ulong  ofile;                     // File pointer to the logging file.
string col_comment[MAX_COLUMNS];  // Comments for each column.
string col_data[MAX_COLUMNS];     // Data for each column.
int    current_column = 0;        // The current column.
int    print_header   = 1;        // Flag to flush header.

// Send the stored columns of information to the output.
print_columns(string data[])
{
  int i;

  for (i=0; i<current_column; ++i) {
    fprintf(ofile, "%s ", data[i]);
  }
  fprintf(ofile, "\n");
  fflush(ofile);
}

// Add one column of comments and data to the output.
put_output(string comment, string data)
{
  if (current_column >= MAX_COLUMNS) {
    fprintf(stderr, "%s: too many columns (%d).  Increase MAX_COLUMNS.\n",
            program_name, current_column);
    exit(1);
  }
  col_comment[current_column] = comment;
  col_data[current_column]    = data;
  ++current_column;
}

flush_output()
{
  if (print_header != 0) {
    print_columns(col_comment);
    print_header = 0;
  }
  print_columns(col_data);
  current_column = 0;
}

// Sets ofile to the output file pointer.  Creates or appends to logfile
// if OUTDIR is set, otherwise sets the file pointer to STDOUT.  It
// starts a new logfile each day.
checkoutput(tm_t now) { string outdir = getenv("OUTDIR"); string outname; tm_t then; char tm_buf[32]; if (outdir == nil) { /* no output dir so use stdout */ if (ofile == 0) { /* first time, so print header and set ofile */ ofile = stdout; print_header = 1; } return; } /* maintain daily output logfiles in OUTDIR */ if (now.tm_yday != then.tm_yday) { /* first time or day has changed, start new logfile */ if (ofile != 0) { /* close existing output file */ fclose(ofile); } strftime(tm_buf, sizeof(tm_buf), "%Y-%m-%d", now); outname = sprintf("%s/percol-%s", outdir, tm_buf); ofile = fopen(outname, "a"); /* open for append either way */ if (ofile == 0) { perror("can't open output logfile"); exit(1); } // Always write header. print_header = 1; then = now; } } int main(int argc, string argv[]) { utsname_t u[1]; long now; long sleep_till; // Time to sleep to. tm_t tm_now; // Get the nodename of the machine. uname(u); nodename = u[0].nodename; program_name = argv[0]; // Handle the command line arguments. switch (argc) { case 1: break; case 2: interval = atoi(argv[1]); break; default: fprintf(stderr, "usage: se [Defines] %s [interval]\n", program_name); fprintf(stderr, "%s can use the following environmental variables:\n", program_name); fprintf(stderr, " setenv WEB_SERVER netscape\n"); fprintf(stderr, " setenv WEB_LOG /ns-home/httpd-80/logs/access - location of web server log\n"); fprintf(stderr, " setenv GATEWAY some.where.com - special address to monitor\n"); fprintf(stderr, " setenv OUTDIR /ns-home/docs/orcallator/logs - default stdout\n"); fprintf(stderr, " setenv SEARCHURL srch.cgi - match for search scripts, default is search.cgi\n"); fprintf(stderr, "Defines:\n"); fprintf(stderr, " -DWATCH_WEB watch web server access logs\n"); fprintf(stderr, " -DWATCH_PROXY use WEB_LOG as a NCSA style proxy log\n"); fprintf(stderr, " -DWATCH_SQUID use WEB_LOG as a Squid log\n"); fprintf(stderr, " -DWATCH_OS includes all of the below:\n"); fprintf(stderr, " -DWATCH_CPU watch the cpu load, run queue, etc\n"); fprintf(stderr, " -DWATCH_MUTEX watch the number of mutex spins\n"); fprintf(stderr, " -DWATCH_NET watch all Ethernet interfaces\n"); fprintf(stderr, " -DWATCH_TCP watch all the TCP/IP stack\n"); fprintf(stderr, " -DWATCH_NFS watch NFS requests\n"); fprintf(stderr, " -DWATCH_MOUNTS watch usage of mount points\n"); fprintf(stderr, " -DWATCH_DISK watch disk read/write usage\n"); fprintf(stderr, " -DWATCH_DNLC watch the directory name lookup cache\n"); fprintf(stderr, " -DWATCH_INODE watch the inode cache\n"); fprintf(stderr, " -DWATCH_RAM watch memory usage\n"); fprintf(stderr, " -DWATCH_PAGES watch where pages are allocated\n"); exit(1); break; } // Initialize the various structures. initialize(); // Run forever. If WATCH_WEB is defined, then have measure_web() // do the sleeping while it is watching the access log file until the // next update time for the whole operating system. Also, collect the // data from the access log file before printing any output. for (;;) { // Calculate the next time to sleep to that is an integer multiple of // the interval time. Make sure that at least half of the interval // passes before waking up. now = time(0); sleep_till = (now/interval)*interval; while (sleep_till < now + interval*0.5) { sleep_till += interval; } #ifdef WATCH_WEB measure_web(sleep_till); #else sleep_till_and_count_new_proceses(sleep_till); #endif // Get the current time. 
    now = time(0);
    tm_now = localtime(&now);
    measure_os(now, tm_now);

#ifdef WATCH_WEB
    put_httpd();
#endif

    // Get a file descriptor to write to.  Maintains daily output files.
    checkoutput(tm_now);

    // Print the output.
    flush_output();
  }

  return 0;
}

initialize()
{
#ifdef WATCH_CPU
  int i;
#endif

#ifdef WATCH_CPU
  // Initialize the process spawning rate measurement variables.
  // Determine if the kernel can be read to measure the last pid.
  i = open("/dev/kmem", O_RDONLY);
  if (i != -1) {
    close(i);
    can_read_kernel = 1;
    mpid_previous  = kvm$mpid;
    mpid_then      = gethrtime();
    mpid_current   = mpid_previous;
    mpid5_then     = mpid_then;
    mpid5_previous = mpid_previous;
    mpid5_current  = mpid_previous;
    mpid5_rate     = 0;
  }
#endif

#ifdef WATCH_WEB
  // Initialize those variables that were not set with environmental
  // variables.
  if (search_url == nil || search_url == "") {
    search_url = "search.cgi";
  }
  if (www_server_proc_name == nil || www_server_proc_name == "") {
    www_server_proc_name = "httpd";
  }
  if (www_gateway == nil || www_gateway == "" ) {
    www_gateway = "NoGatway";
    www_gatelen = 0;
  }
  else {
    www_gatelen = strlen(www_gateway);
  }

  // Initialize the web server watching variables.  Move the file pointer
  // to the end of the web access log and note the current time.
  if (www_log_filename != nil) {
    www_fd = fopen(www_log_filename, "r");
    if (www_fd != 0) {
      stat(www_log_filename, www_stat);
      www_ino  = www_stat[0].st_ino;
      www_size = www_stat[0].st_size;
      // Move to the end of the file.
      fseek(www_fd, 0, 2);
    }
  }
  www_then  = gethrtime();
  www5_then = www_then;
#endif

  // Sleep to give the disks a chance to update.
  sleep(DISK_UPDATE_RATE);

  // Get the clock tick rate.
  hz = sysconf(_SC_CLK_TCK);

  // Get the page size.
  page_size = sysconf(_SC_PAGESIZE);

  // Calculate the system boot time.
  boot_time = time(0) - (kstat$misc.clk_intr / hz);

  // Perform the first measurement of the system.
  _measure_os();
}

// Measure the system statistics all at once.
_measure_os()
{
  tmp_lrcpu   = lr_cpu$cpu;
  tmp_mutex   = lr_mutex$m;
  tmp_nr      = lr_net$nr;
  tmp_lrtcp   = lr_tcp$tcp;
#ifdef WATCH_TCP
  tmp_tcp     = tcp$tcp;
#endif
  tmp_lrpcc   = lr_rpcclient$r;
  tmp_dr      = lr_disk$dr;
  tmp_lrdnlc  = lr_dnlc$dnlc;
  tmp_lrinode = lr_inode$inode;
  tmp_lrram   = lr_ram$ram;
#ifdef WATCH_PAGES
  tmp_kstat_pages = kstat$pages;
#endif
  tmp_lrswap  = lr_swapspace$s;
  tmp_lrkmem  = lr_kmem$kmem;
  tmp_kstat_misc = kstat$misc;
}

measure_os(long now, tm_t tm_now)
{
  // Measure the system now.
  _measure_os();

  // Take care of miscellaneous measurements.
  measure_misc(now, tm_now);

  // Take care of cpu.
#ifdef WATCH_CPU
  measure_cpu();
#endif

  // Take care of mutexes.
#ifdef WATCH_MUTEX
  measure_mutex();
#endif

  // Take care of mount points.
#ifdef WATCH_MOUNTS
  measure_mounts();
#endif

  // Take care of the disks.
#ifdef WATCH_DISK
  measure_disk();
#endif

  // Take care of ram.
#ifdef WATCH_RAM
  measure_ram();
#endif

  // Take care of the network.
#ifdef WATCH_NET
  measure_net();
#endif

  // Take care of TCP/IP.
#ifdef WATCH_TCP
  measure_tcp();
#endif

  // Take care of NFS.
#ifdef WATCH_NFS
  measure_nfs();
#endif

  // Take care of DNLC.
#ifdef WATCH_DNLC
  measure_dnlc();
#endif

  // Take care of the inode cache.
#ifdef WATCH_INODE
  measure_inode();
#endif

  // Take care of page allocations.
#ifdef WATCH_PAGES measure_pages(); #endif } /* state as a character */ char state_char(int state) { switch(state) { case ST_WHITE: return 'w'; /* OK states are lower case */ case ST_BLUE: return 'b'; case ST_GREEN: return 'g'; case ST_AMBER: return 'A'; /* bad states are upper case to stand out */ case ST_RED: return 'R'; case ST_BLACK: return 'B'; default: return 'I'; /* invalid state */ } } measure_misc(long now, tm_t tm_now) { long uptime; char states[12]; char tm_buf[16]; uptime = now - boot_time; states = "wwwwwwwwwww"; strftime(tm_buf, sizeof(tm_buf), "%T", tm_now); states[0] = state_char(lr_disk$dr.state); states[1] = state_char(lr_net$nr.state); states[2] = state_char(lr_rpcclient$r.state); states[3] = state_char(lr_swapspace$s.state); states[4] = state_char(lr_ram$ram.state); states[5] = state_char(lr_kmem$kmem.state); states[6] = state_char(lr_cpu$cpu.state); states[7] = state_char(lr_mutex$m.state); states[8] = state_char(lr_dnlc$dnlc.state); states[9] = state_char(lr_inode$inode.state); states[10]= state_char(lr_tcp$tcp.state); put_output(" timestamp", sprintf("%10d", now)); put_output("locltime", tm_buf); put_output("DNnsrkcmdit", states); put_output(" uptime", sprintf("%8d", uptime)); } sleep_till_and_count_new_proceses(long sleep_till) { long now; #ifdef WATCH_CPU long sleep_till1; int mpid5_diff; double mpid5_interval; double rate; #endif now = time(0); while (now < sleep_till) { #ifdef WATCH_CPU if (can_read_kernel != 0) { // Sleep at least 5 seconds to make a measurement. sleep_till1 = now + 5; while (now < sleep_till1) { sleep(sleep_till1 - now); now = time(0); } // Measure the 5 second process creation rate. mpid5_current = kvm$mpid; mpid5_now = gethrtime(); mpid5_interval = (mpid5_now - mpid5_then) * 0.000000001; mpid5_then = mpid5_now; if (mpid5_current >= mpid5_previous) { mpid5_diff = mpid5_current - mpid5_previous; } else { mpid5_diff = mpid5_current + DEFAULT_MAXPID - mpid5_previous; } rate = mpid5_diff/mpid5_interval; if (rate > mpid5_rate) { mpid5_rate = rate; } mpid5_previous = mpid5_current; // Now take these results to measure the long interval rate. // Because the mpid may flip over DEFAULT_MAXPID more than once // in the long interval time span, use the difference between // the previous and current mpid over a 5 second interval to // calculate the long interval difference. mpid_current += mpid5_diff; mpid_now = mpid5_now; } else { sleep(sleep_till - now); } #else sleep(sleep_till - now); #endif now = time(0); } } #ifdef WATCH_CPU measure_cpu() { p_vmstat pvm; double mpid_interval; double mpid_rate; pvm = vmglobal_total(); // In SE 3.0 user_time and system_time are int and in SE 3.1 they are // double, so cast everything to double using + 0.0. put_output(" usr%", sprintf("%5.1f", pvm.user_time + 0.0)); put_output(" sys%", sprintf("%5.1f", pvm.system_time + 0.0)); put_output(" 1runq", sprintf("%6.2f", tmp_kstat_misc.avenrun_1min/256.0)); put_output(" 5runq", sprintf("%6.2f", tmp_kstat_misc.avenrun_5min/256.0)); put_output("15runq", sprintf("%6.2f", tmp_kstat_misc.avenrun_15min/256.0)); put_output("#proc", sprintf("%5lu", tmp_kstat_misc.nproc)); put_output("scanrate", sprintf("%8.3f", pvm.scan)); // Calculate the rate of new process spawning. if (can_read_kernel != 0) { mpid_interval = (mpid_now - mpid_then) * 0.000000001; mpid_rate = (mpid_current - mpid_previous) / mpid_interval; put_output("#proc/s", sprintf("%7.3f", mpid_rate)); put_output("#proc/p5s", sprintf("%9.4f", mpid5_rate)); // Reset counters. 
mpid_then = mpid_now; mpid_previous = mpid_current; mpid5_rate = 0; } } #endif #ifdef WATCH_MUTEX measure_mutex() { put_output(" smtx", sprintf("%5d", tmp_mutex.smtx)); put_output("smtx/cpu", sprintf("%8d", tmp_mutex.smtx/tmp_mutex.ncpus)); put_output("ncpus", sprintf("%5d", tmp_mutex.ncpus)); } #endif #ifdef WATCH_NET measure_net() { int previous_count = -1; int current_count; int i; current_count = 0; for (i=0; i peak_disk_busy) { peak_disk_busy = GLOBAL_disk[i].run_percent; } } mean_disk_busy = mean_disk_busy/GLOBAL_disk_count; put_output("disk_peak", sprintf("%9.3f", peak_disk_busy)); put_output("disk_mean", sprintf("%9.3f", mean_disk_busy)); put_output("disk_rd/s", sprintf("%9.1f", total_reads)); put_output("disk_wr/s", sprintf("%9.1f", total_writes)); put_output("disk_rK/s", sprintf("%9.1f", total_readk)); put_output("disk_wK/s", sprintf("%9.1f", total_writek)); // If the number of disks has changed, say due to a add_drv, then print // new headers. if (previous_count != GLOBAL_disk_count) { print_header = 1; previous_count = GLOBAL_disk_count; } } #endif #ifdef WATCH_DNLC measure_dnlc() { put_output("dnlc_ref/s", sprintf("%10.3f", tmp_lrdnlc.refrate)); put_output("dnlc_hit%", sprintf("%9.3f", tmp_lrdnlc.hitrate)); } #endif #ifdef WATCH_INODE measure_inode() { put_output("inod_ref/s", sprintf("%10.3f", tmp_lrinode.refrate)); put_output("inod_hit%", sprintf("%9.3f", tmp_lrinode.hitrate)); put_output("inod_stl/s", sprintf("%10.3f", tmp_lrinode.iprate)); } #endif #ifdef WATCH_RAM measure_ram() { put_output("swap_avail", sprintf("%10ld", GLOBAL_pvm[0].swap_avail)); put_output("page_rstim", sprintf("%10d", tmp_lrram.restime)); put_output(" freememK", sprintf("%10d", GLOBAL_pvm[0].freemem)); put_output("free_pages", sprintf("%10d", (GLOBAL_pvm[0].freemem*1024)/page_size)); } #endif #ifdef WATCH_PAGES measure_pages() { put_output("pp_kernel", sprintf("%9lu", tmp_kstat_pages.pp_kernel)); put_output("pagesfree", sprintf("%9lu", tmp_kstat_pages.pagesfree)); put_output("pageslock", sprintf("%9lu", tmp_kstat_pages.pageslocked)); put_output("pagesio", sprintf("%7lu", tmp_kstat_pages.pagesio)); put_output("pagestotl", sprintf("%9lu", tmp_kstat_pages.pagestotal)); } #endif #ifdef WATCH_WEB // Breakdown access log format. accesslog(string buf) { int z; int size_index; int ishead; string word; char first_byte[1]; #if WATCH_PROXY || WATCH_SQUID || WATCH_YAHOO double xf; #ifdef WATCH_SQUID string logtag; string request; #endif #ifdef WATCH_YAHOO string arg; ulong ptr; ulong tmp; ulong ulong_xf; #endif #endif ishead = 0; #ifdef WATCH_YAHOO // Make sure that the input line has at least 32 bytes of data plus a new // line, for a total length of 33. if (strlen(buf) < 33) { return; } word = strtok(buf,"\05"); #else word = strtok(buf," "); #endif if (word == nil) { return; } #ifdef WATCH_SQUID // Word contains unix time in seconds.milliseconds. 
word = strtok(nil, " "); // Elapsed request time in ms xf = atof(word)/1000.0; www_dwnld_time_sum += xf; #ifdef DINKY printf("time: %s %f total %f\n", word, xf, xfer_sum); #endif word = strtok(nil, " "); // Client IP address logtag = strtok(nil, "/"); // Log tag word = strtok(nil, " "); // Reply code WWW_REPLY_CODE(word) word = strtok(nil, " "); // Size sent to client z = atoi(word); request = strtok(nil, " "); // Request method word = strtok(nil, " "); // URL if (word != nil) { if (word =~ "cgi-bin") { httpop_cgi_bins++; } if (word =~ search_url) { httpop_searches++; } } word = strtok(nil, " "); // Optional user ident word = strtok(nil, "/"); // Hierarchy if (word != nil) { if (word =~ "DIRECT") { prxy_squid_indirect++; } } word = strtok(nil, " "); // Hostname word = strtok(nil, " "); // Content-type // Process the collected data. if (logtag =~ "TCP") { squid_client_http++; } if (logtag =~ "UDP") { squid_icp_requests++; } if (logtag =~ "HIT") { prxy_squid_cache_hits++; } if (logtag =~ "MISS") { squid_cache_misses++; } WWW_METHOD(request) // Do not add size if its a HEAD. if (ishead == 0) { dwnld_totalz += z; } WWW_SIZE_INDEX(z, size_index) www_dwnld_time_by_size[size_index] += xf; #elif WATCH_YAHOO // Yahoo log format. Fields in square brackets will only appear in the // log file if the data actually exists (ie. you will never see a null // Referrer field). Further, fields labelled here with "(CONFIG)" will // only appear if they are enabled via the YahooLogOptions configuration // directive. // // IP Address (8 hex digits) // Timestamp (time_t as 8 hex digits) // Processing Time (in microseconds, as 8 hex digits) // Bytes Sent (8 hex digits) // URL // [^Er referrer] (CONFIG) // [^Em method] (CONFIG) // [^Es status_code] // ^Ed signature // \n // Ignore the IP address and timestamp. Get the processing time, the // number of bytes sent and the URL. For each portion of the line, split // it up into separate pieces. if (sscanf(word, "%8lx%8lx%8x%8x", &tmp, &tmp, &ulong_xf, &z) != 4) { return; } xf = ulong_xf/1000000.0; WWW_SIZE_INDEX(z, size_index) www_dwnld_time_sum += xf; www_dwnld_time_by_size[size_index] += xf; if (word =~ "cgi-bin") { httpop_cgi_bins++; } if (word =~ search_url) { httpop_searches++; } for (;;) { word = strtok(nil, "\05"); if (word == nil) { break; } first_byte = word; ptr = &word + 1; arg = ((string) ptr); ptr = 0; switch (first_byte[0]) { case 'm': WWW_METHOD(arg) ptr = 1; break; case 's': WWW_REPLY_CODE(arg) break; default: break; } } // If no method was seen, then assume it was a GET. if (ptr == 0) { httpop_gets++; } // Do not add size if its a HEAD. if (ishead == 0) { dwnld_totalz += z; } #else /* common or netscape proxy formats */ strtok(nil, " "); // - strtok(nil, " "); // - strtok(nil, " ["); // date strtok(nil, " "); // zone] word = strtok(nil, " \""); // GET or POST WWW_METHOD(word) word = strtok(nil, " "); // URL if (word != nil) { if (word =~ "cgi-bin") { httpop_cgi_bins++; } if (word =~ search_url) { httpop_searches++; } } // Sometimes HTTP/1.x is not listed in the access log. Skip it // if it does exist. Load the error/success code. word = strtok(nil, " "); if (word != nil && (word =~ "HTTP" || word =~ "http")) { word = strtok(nil, " "); } WWW_REPLY_CODE(word) word = strtok(nil, " "); // Bytes transferred. if (word != nil) { z = atoi(word); if (ishead == 0) { // Do not add size if its a HEAD. 
dwnld_totalz += z; } WWW_SIZE_INDEX(z, size_index) } #ifdef WATCH_PROXY word = strtok(nil, " "); // status from server word = strtok(nil, " "); // length from server word = strtok(nil, " "); // length from client POST word = strtok(nil, " "); // length POSTed to remote word = strtok(nil, " "); // client header req word = strtok(nil, " "); // proxy header resp word = strtok(nil, " "); // proxy header req word = strtok(nil, " "); // server header resp word = strtok(nil, " "); // transfer total secs word = strtok(nil, " "); // route // - DIRECT PROXY(host.domain:port) SOCKS if (word != nil) { if (strncmp(word, "PROXY", 5) == 0 || strncmp(word, "SOCKS", 5) == 0) { prxy_squid_indirect++; } } word = strtok(nil, " "); // client finish status word = strtok(nil, " "); // server finish status word = strtok(nil, " "); // cache finish status // ERROR HOST-NOT-AVAILABLE = error or incomplete op // WRITTEN REFRESHED CL-MISMATCH(content length mismatch) = cache_writes // NO-CHECK UP-TO-DATE = cache_hits // DO-NOT-CACHE NON-CACHEABLE = uncacheable if (word != nil) { switch(word) { case "WRITTEN": case "REFRESHED": case "CL-MISMATCH": prxy_cache_writes++; break; case "NO-CHECK": case "UP-TO-DATE": prxy_squid_cache_hits++; break; case "DO-NOT-CACHE": case "NON-CACHEABLE": prxy_uncacheable++; break; default: break; } } word = strtok(nil, " ["); // [transfer total time x.xxx if (word != nil) { xf = atof(word); www_dwnld_time_sum += xf; www_dwnld_time_by_size[size_index] += xf; } #endif #endif } measure_web(long sleep_till) { double lastops = 0.0; char buf[BUFSIZ]; int i; long now; httpops = 0.0; httpops5 = 0.0; gateops = 0.0; httpop_gets = 0; httpop_condgets = 0; httpop_posts = 0; httpop_cgi_bins = 0; httpop_errors = 0; httpop_searches = 0; for (i=0; i<5; i++) { dwnld_size[i] = 0; #if WATCH_PROXY || WATCH_SQUID || WATCH_YAHOO www_dwnld_time_by_size[i] = 0.0; #endif } dwnld_totalz = 0; #if WATCH_PROXY || WATCH_SQUID || WATCH_YAHOO www_dwnld_time_sum = 0.0; #endif #if WATCH_PROXY || WATCH_SQUID prxy_squid_indirect = 0; prxy_squid_cache_hits = 0; #ifdef WATCH_PROXY prxy_cache_writes = 0; prxy_uncacheable = 0; #else squid_cache_misses = 0; squid_icp_requests = 0; squid_icp_queries = 0; squid_client_http = 0; #endif #endif if (www_log_filename != nil) { now = time(0); while (now < sleep_till) { #ifdef WATCH_CPU sleep_till_and_count_new_proceses(now + 5); #else sleep(5); #endif now = time(0); if (www_fd != 0) { while (fgets(buf, BUFSIZ, www_fd) != nil) { httpops += 1.0; if (www_gatelen > 0) { if (strncmp(buf, www_gateway, www_gatelen) == 0) { gateops += 1.0; } } accesslog(buf); } } /* see if the file has been switched or truncated */ stat(www_log_filename, www_stat); if (www_ino != www_stat[0].st_ino || www_size > www_stat[0].st_size) { if (www_fd != 0) { fclose(www_fd); /* close the old log */ } /* log file has changed, open the new one */ www_fd = fopen(www_log_filename, "r"); if (www_fd != 0) { www_ino = www_stat[0].st_ino; while(fgets(buf, BUFSIZ, www_fd) != nil) { httpops += 1.0; if (www_gatelen > 0) { if (strncmp(buf, www_gateway, www_gatelen) == 0) { gateops += 1.0; } } accesslog(buf); } } } www5_now = gethrtime(); www5_interval = (www5_now - www5_then) * 0.000000001; www5_then = www5_now; dtmp = (httpops - lastops)/www5_interval; if (dtmp > httpops5) { httpops5 = dtmp; } lastops = httpops; // Remember size for next time. 
        www_size = www_stat[0].st_size;
      }
    }
    else {
      sleep_till_and_count_new_proceses(sleep_till);
      www5_now = gethrtime();
    }

    www_now      = www5_now;
    www_interval = (www_now - www_then) * 0.000000001;
    www_then     = www_now;

    // Use dtmp to get percentages.
    if (httpops == 0.0) {
      dtmp = 0.0;
    }
    else {
      dtmp = 100.0 / httpops;
    }

#if WATCH_PROXY || WATCH_SQUID || WATCH_YAHOO
    for (i=0; i<5; i++) {
      if (dwnld_size[i] == 0) {
        www_dwnld_time_by_size[i] = 0.0;
      }
      else {
        www_dwnld_time_by_size[i] = www_dwnld_time_by_size[i]/dwnld_size[i];
      }
    }
#endif
}

int count_proc(string name)
{
  int count;
  prpsinfo_t p;

  count = 0;
  for (p=first_proc(); p.pr_pid != -1; p=next_proc()) {
    if (p.pr_fname =~ name) {
      count++;
    }
  }
  return count;
}

put_httpd()
{
  put_output("#httpds",  sprintf("%7d", count_proc(www_server_proc_name)));
  put_output("httpop/s", sprintf("%8.2f", httpops/www_interval));
  put_output("http/p5s", sprintf("%8.2f", httpops5));
  put_output("cndget/s", sprintf("%8.2f", httpop_condgets/www_interval));
  put_output("search/s", sprintf("%8.3f", httpop_searches/www_interval));
  put_output(" cgi/s",   sprintf("%8.3f", httpop_cgi_bins/www_interval));
  put_output(" htErr/s", sprintf("%8.3f", httpop_errors/www_interval));
  put_output(" httpb/s", sprintf("%8.0f", dwnld_totalz/www_interval));
  put_output(" %to1KB",  sprintf("%8.2f", dtmp*dwnld_size[0]));
  put_output(" %to10KB", sprintf("%8.2f", dtmp*dwnld_size[1]));
  put_output("%to100KB", sprintf("%8.2f", dtmp*dwnld_size[2]));
  put_output(" %to1MB",  sprintf("%8.2f", dtmp*dwnld_size[3]));
  put_output("%over1MB", sprintf("%8.2f", dtmp*dwnld_size[4]));
  put_output(www_gateway, sprintf("%8.2f", gateops/www_interval));
#if WATCH_PROXY || WATCH_SQUID
  put_output(" %indir",  sprintf("%8.2f", dtmp * prxy_squid_indirect));
  put_output("%cch_hit", sprintf("%8.2f", dtmp * prxy_squid_cache_hits));
#ifdef WATCH_PROXY
  put_output("%cch_wrt", sprintf("%8.2f", dtmp * prxy_cache_writes));
  put_output("%cch_unc", sprintf("%8.2f", dtmp * prxy_uncacheable));
#else
  put_output("%cch_mis", sprintf("%8.2f", dtmp * squid_cache_misses));
  put_output("%cch_req", sprintf("%8.2f", dtmp * squid_icp_requests));
  put_output("%cch_qry", sprintf("%8.2f", dtmp * squid_icp_queries));
#endif
  put_output(" xfr_t",   sprintf("%8.2f", 0.01 * dtmp * www_dwnld_time_sum));
  put_output(" xfr1_t",  sprintf("%8.2f", www_dwnld_time_by_size[0]));
  put_output(" xfr10_t", sprintf("%8.2f", www_dwnld_time_by_size[1]));
  put_output("xfr100_t", sprintf("%8.2f", www_dwnld_time_by_size[2]));
  put_output(" xfr1M_t", sprintf("%8.2f", www_dwnld_time_by_size[3]));
  put_output("xfro1M_t", sprintf("%8.2f", www_dwnld_time_by_size[4]));
#elif WATCH_YAHOO
  put_output(" wprc_t",   sprintf("%9.5f", 0.01 * dtmp * www_dwnld_time_sum));
  put_output(" wprc1_t",  sprintf("%9.5f", www_dwnld_time_by_size[0]));
  put_output(" wprc10_t", sprintf("%9.5f", www_dwnld_time_by_size[1]));
  put_output("wprc100_t", sprintf("%9.5f", www_dwnld_time_by_size[2]));
  put_output(" wprc1M_t", sprintf("%9.5f", www_dwnld_time_by_size[3]));
  put_output("wprco1M_t", sprintf("%9.5f", www_dwnld_time_by_size[4]));
#endif
}
#endif

From bzajac at geostaff.com  Thu Oct  7 13:14:41 1999
From: bzajac at geostaff.com (Blair Zajac)
Date: Thu, 07 Oct 1999 13:14:41 -0700
Subject: [Orca-announce] New orcallator.se - 1.16
Message-ID: <37FCFF31.7D496159@geostaff.com>

From the code:

// Version 1.16: Oct 7, 1999   To keep backwards compatibility, define
//                             WATCH_WEB if WATCH_HTTPD is defined.
//                             If the COMPRESSOR environmental variable
//                             is defined, then when a new log file is
//                             opened for a new day, the just closed log
//                             file is compressed using the COMPRESSOR
//                             command in the following manner:
//                               system(sprintf("%s %s &", COMPRESSOR, log_file))
//                             COMPRESSOR should be set to something like
//                             "gzip -9", or "compress", or "bzip2 -9".
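Concretely, a COMPRESSOR value such as "gzip -9" is turned into the
format string "gzip -9 %s &" once at startup, and the name of the just
closed log is substituted in before the command is handed to system();
the trailing "&" keeps the logger from blocking while the compression
runs.  A rough standalone C equivalent (illustrative only; the log file
name below is made up):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
      const char *compressor = getenv("COMPRESSOR");  /* e.g. "gzip -9" */
      const char *log_file   = "percol-1999-10-06";   /* hypothetical */
      char fmt[256], cmd[512];

      if (compressor != NULL && compressor[0] != '\0') {
        /* Build "gzip -9 %s &" once, then fill in the file name. */
        snprintf(fmt, sizeof(fmt), "%s %%s &", compressor);
        snprintf(cmd, sizeof(cmd), fmt, log_file);
        system(cmd);            /* compress in the background */
      }
      return 0;
    }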
Blair

-------------- next part --------------
//
// Orcallator.se, a log-generating performance monitor.
//
// This program logs many different system quantities to a log file
// for later processing.
//
// Author: Blair Zajac .
//
// Portions copied from percollator.se written by Adrian Cockcroft.
//
// Version 1.16: Oct 7, 1999   To keep backwards compatibility, define
//                             WATCH_WEB if WATCH_HTTPD is defined.
//                             If the COMPRESSOR environmental variable
//                             is defined, then when a new log file is
//                             opened for a new day, the just closed log
//                             file is compressed using the COMPRESSOR
//                             command in the following manner:
//                               system(sprintf("%s %s &", COMPRESSOR, log_file))
//                             COMPRESSOR should be set to something like
//                             "gzip -9", or "compress", or "bzip2 -9".
// Version 1.15: Oct 5, 1999   kvm$mpid is an int, not a long.  This
//                             caused problems on Solaris 7 hosts running
//                             a 64 bit kernel.
// Version 1.14: Oct 1, 1999   Rename disk.c?t?d? column names to
//                             disk_runp.c?t?d? to better reflect the data
//                             being recorded and to allow for more per
//                             disk information later.
// Version 1.13: Sep 24, 1999  Fix a bug in the disk_mean calculation
//                             where it was being divided by the wrong
//                             disk_count.  Now it should be much larger
//                             and in scale with disk_peak.  When
//                             WATCH_DISK is defined, now print each
//                             disk's run percent.  Add a new define
//                             WATCH_MOUNTS, which reports each mount
//                             point's disk space and inode capacity,
//                             usage, available for non-root users and
//                             percent used.  This comes from Duncan
//                             Lawie tyger at hoopoes.com.  Add some
//                             smarts so that if the number of
//                             interfaces, physical disks, or mounted
//                             partitions changes, then a new header is
//                             printed.  This will prevent column name
//                             and data mixups when the system
//                             configuration changes.
// Version 1.12: Sep 14, 1999  Add the page scan rate as scanrate in
//                             measure_cpu.
// Version 1.11: Aug 13, 1999  Add the number of CPUs as ncpus.  Move
//                             measure_disk and measure_ram sooner in the
//                             list of subsystems to handle.  Increase
//                             the number of characters for each network
//                             interface from four to five.  Add new disk
//                             reads, writes, Kbytes read, and Kbytes
//                             written per second.  Add the amount of
//                             free memory in Kbytes as freememK.
// Version 1.10: Jul 28, 1999  Measure the process spawn rate if
//                             WATCH_CPU is defined and the user is root.
// Version 1.9:  Jun 2, 1999   If WATCH_YAHOO is defined, then process
//                             the access log as a Yahoo! style access
//                             log.  Restructure the code to handle
//                             different web server access log formats.
// Version 1.8:  Jun 1, 1999   If the environmental variable WEB_SERVER
//                             is defined, use its value as the name of
//                             the process to count for the number of
//                             web servers on the system.  If WEB_SERVER
//                             is not defined, then count the number of
//                             httpd's.
// Version 1.7:  Mar 25, 1999  Simplify and speed up count_proc by 20%.
// Version 1.6:  Feb 23, 1999  Print pvm.user_time and system_time
//                             correctly.
// Version 1.5:  Feb 23, 1999  Always write header to a new file.
// Version 1.4:  Feb 19, 1999  Handle missing HTTP/1.x in access log.
// Version 1.3:  Feb 18, 1999  On busy machines httpops5 will be
//                             enlarged.
// Version 1.2:  Feb 18, 1999  Output data on integral multiples of
//                             interval.
// Version 1.1:  Feb 18, 1999  Integrate Squid log processing from SE
//                             3.1.
// Version 1.0:  Sep 9, 1998   Initial version.
//
// The default sampling interval in seconds.
#define SAMPLE_INTERVAL 300

// The maximum number of columns of data.
#define MAX_COLUMNS 512

// Define the different parts of the system you want to examine.
#ifdef WATCH_OS
#define WATCH_CPU    1
#define WATCH_MUTEX  1
#define WATCH_NET    1
#define WATCH_TCP    1
#define WATCH_NFS    1
#define WATCH_MOUNTS 1
#define WATCH_DISK   1
#define WATCH_DNLC   1
#define WATCH_INODE  1
#define WATCH_RAM    1
#define WATCH_PAGES  1
#endif

// Keep backwards compatibility with WATCH_HTTPD.
#ifdef WATCH_HTTPD
#define WATCH_WEB 1
#endif

#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include

#ifdef WATCH_MOUNTS
#include
#include
#endif

#if WATCH_CPU || WATCH_WEB
#include

#ifdef WATCH_CPU
// This is the maximum pid on Solaris hosts.
#define DEFAULT_MAXPID 30000
#include
#endif

#ifdef WATCH_WEB
#include

// Define this macro which returns the size index for a file of a
// particular size.  This saves the overhead of a function call.
#define WWW_SIZE_INDEX(size, size_index) \
  if (size < 1024) { \
    size_index=0;      /* under 1KB */ \
  } else { \
    if (size < 10240) { \
      size_index=1;    /* under 10K */ \
    } else { \
      if (size < 102400) { \
        size_index=2;  /* under 100KB */ \
      } else { \
        if (size < 1048576) { \
          size_index=3;  /* under 1MB */ \
        } else { \
          size_index=4;  /* over 1MB */ \
        } \
      } \
    } \
  } \
  dwnld_size[size_index]++;

// Handle the reply code from the server.
#define WWW_REPLY_CODE(word) \
  if (word != nil) { \
    if (word == "304") { \
      httpop_condgets++; \
    } \
    else { \
      first_byte = word; \
      if (first_byte[0] == '4' || first_byte[0] == '5') { \
        httpop_errors++; \
      } \
    } \
  }

// Handle the method of the object served.  This define only works with
// non-proxy servers.
#define WWW_METHOD1(word) \
  if (word != nil) { \
    switch (word) { \
    case "get": \
    case "GET": \
      httpop_gets++; \
      break; \
    case "post": \
    case "POST": \
      httpop_posts++; \
      break; \
    case "head": \
    case "HEAD": \
      ishead = 1; \
      httpop_condgets++; \
      break;

#ifdef WATCH_SQUID
#define WWW_METHOD2 \
    case "icp_query": \
    case "ICP_QUERY": \
      squid_icp_queries++; \
      break;
#else
#define WWW_METHOD2
#endif

#define WWW_METHOD_END \
    default: \
      break; \
    } \
  }

#define WWW_METHOD(word) WWW_METHOD1(word) WWW_METHOD2 WWW_METHOD_END
#endif
#endif

// Put all rules here so they can be accessed by the handle functions.
lr_cpu_t       lr_cpu$cpu;
lr_cpu_t       tmp_lrcpu;
lr_mutex_t     lr_mutex$m;
lr_mutex_t     tmp_mutex;
lr_net_t       lr_net$nr;
lr_net_t       tmp_nr;
lr_tcp_t       lr_tcp$tcp;
lr_tcp_t       tmp_lrtcp;
#ifdef WATCH_TCP
tcp            tcp$tcp;
tcp            tmp_tcp;
#endif
lr_rpcclient_t lr_rpcclient$r;
lr_rpcclient_t tmp_lrpcc;
lr_disk_t      lr_disk$dr;
lr_disk_t      tmp_dr;
lr_dnlc_t      lr_dnlc$dnlc;
lr_dnlc_t      tmp_lrdnlc;
lr_inode_t     lr_inode$inode;
lr_inode_t     tmp_lrinode;
lr_ram_t       lr_ram$ram;
lr_ram_t       tmp_lrram;
#ifdef WATCH_PAGES
ks_system_pages kstat$pages;
ks_system_pages tmp_kstat_pages;
#endif
lr_swapspace_t lr_swapspace$s;
lr_swapspace_t tmp_lrswap;
lr_kmem_t      lr_kmem$kmem;
lr_kmem_t      tmp_lrkmem;
ks_system_misc kstat$misc;
ks_system_misc tmp_kstat_misc;

// Put application globals here.
string nodename;                    // Name of this machine.
string program_name;                // Name of this program.
int    hz;                          // Clock tick rate.
int    page_size;                   // Page size in bytes.
long   boot_time;                   // Boot time of the system.
long   interval = SAMPLE_INTERVAL;  // Sampling interval.

#ifdef WATCH_CPU
int can_read_kernel = 0;            // If the kernel can be read.
int kvm$mpid;                       // The last created PID.

// These variables store the mpid before and after the standard interval.
int mpid_previous;
int mpid_current;
ulonglong mpid_then;
ulonglong mpid_now;

// These variables store the mpid before and after 5 second intervals.
int mpid5_previous;
int mpid5_current;
ulonglong mpid5_then;
ulonglong mpid5_now;
double mpid5_rate;
#endif

#ifdef WATCH_MOUNTS
mnttab_t mnt$mnt;
mnttab_t tmp_mnt;
#endif

// Variables for handling the httpd access log.
#ifdef WATCH_WEB
string www_search_url       = getenv("SEARCHURL");
string www_server_proc_name = getenv("WEB_SERVER");
string www_log_filename     = getenv("WEB_LOG");
string www_gateway          = getenv("GATEWAY");
ulong  www_fd;
uint   www_gatelen;
stat_t www_stat[1];
ulong  www_ino;
long   www_size;
double www_interval;       // Hi-res interval time.
ulonglong www_then;
ulonglong www_now;
double www5_interval;      // Actual hi-res 5 second interval.
ulonglong www5_then;
ulonglong www5_now;
double httpops;
double httpops5;
double gateops;
double dtmp;
long   httpop_gets;
long   httpop_condgets;    // HEAD or code = 304 conditional get no data.
long   httpop_posts;
long   httpop_cgi_bins;
long   httpop_searches;
long   httpop_errors;
long   dwnld_size[5];      // [0] < 1K, [1] < 10K, [2] < 100K, [3] < 1M, [4] >= 1M
long   dwnld_totalz;       // total size counted from log
#if WATCH_PROXY || WATCH_SQUID || WATCH_YAHOO
// If we're watching a Yahoo log, then take the transfer time to be the
// processing time.
double www_dwnld_time_sum;          // transfer time
double www_dwnld_time_by_size[5];   // mean transfer time by size bin
#endif
#if WATCH_PROXY || WATCH_SQUID
long prxy_squid_indirect;      // number of hits that go via PROXY,SOCKS,parent
long prxy_squid_cache_hits;    // number of hits returned from cache
#endif
#ifdef WATCH_PROXY
long prxy_cache_writes;        // number of writes and updates to cache
long prxy_uncacheable;         // number of explicitly uncacheable httpops
// any extra is errors or incomplete ops
#endif
#ifdef WATCH_SQUID
long squid_cache_misses;
long squid_icp_requests;
long squid_icp_queries;
long squid_client_http;
#endif
#endif

// Variables for handling output.
string compress = getenv("COMPRESSOR");  // How to compress logs.
ulong  ofile;                     // File pointer to the logging file.
string col_comment[MAX_COLUMNS];  // Comments for each column.
string col_data[MAX_COLUMNS];     // Data for each column.
int    current_column = 0;        // The current column.
int    print_header   = 1;        // Flag to flush header.

// Send the stored columns of information to the output.
print_columns(string data[])
{
  int i;

  for (i=0; i<current_column; ++i) {
    fprintf(ofile, "%s ", data[i]);
  }
  fprintf(ofile, "\n");
  fflush(ofile);
}

// Add one column of comments and data to the output.
put_output(string comment, string data)
{
  if (current_column >= MAX_COLUMNS) {
    fprintf(stderr, "%s: too many columns (%d).  Increase MAX_COLUMNS.\n",
            program_name, current_column);
    exit(1);
  }
  col_comment[current_column] = comment;
  col_data[current_column]    = data;
  ++current_column;
}

flush_output()
{
  if (print_header != 0) {
    print_columns(col_comment);
    print_header = 0;
  }
  print_columns(col_data);
  current_column = 0;
}
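// The routines above buffer one row of (header, value) pairs per sample
// and re-emit the header row whenever print_header is raised, e.g. on a
// new log file or when the set of disks or interfaces changes, as the
// 1.13 changelog entry describes.  A C-style sketch of that
// config-change check (illustrative names; measure_disk below does the
// real thing):
//
//   static int previous_disk_count = -1;
//
//   void note_disk_count(int current_disk_count)
//   {
//     if (previous_disk_count != current_disk_count) {
//       print_header = 1;                 /* force a fresh header row */
//       previous_disk_count = current_disk_count;
//     }
//   }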
// Sets ofile to the output file pointer.  Creates or appends to the
// log file if OUTDIR is set, otherwise sets the file pointer to STDOUT.
// It starts a new log file each day.  It compresses the previous day's
// log file if the environmental variable COMPRESSOR is set.
checkoutput(tm_t now)
{
  string outdir = getenv("OUTDIR");
  string outname;
  tm_t   then;
  char   tm_buf[32];

  if (outdir == nil) {
    // No output dir so use stdout.
    if (ofile == 0) {
      // First time, so print header and set ofile.
      ofile = stdout;
      print_header = 1;
    }
    return;
  }

  // Maintain daily output logfiles in OUTDIR.
if (now.tm_yday != then.tm_yday) { // First time or day has changed, start new logfile. if (ofile != 0) { // Close and optionally compress the existing output file. fclose(ofile); if (compress != nil) { system(sprintf(compress, outname)); } } strftime(tm_buf, sizeof(tm_buf), "%Y-%m-%d", now); outname = sprintf("%s/percol-%s", outdir, tm_buf); // Open for append either way. ofile = fopen(outname, "a"); if (ofile == 0) { perror("can't open output logfile"); exit(1); } // Always write header. print_header = 1; then = now; } } int main(int argc, string argv[]) { utsname_t u[1]; long now; long sleep_till; // Time to sleep to. tm_t tm_now; // Get the nodename of the machine. uname(u); nodename = u[0].nodename; program_name = argv[0]; // Handle the command line arguments. switch (argc) { case 1: break; case 2: interval = atoi(argv[1]); break; default: fprintf(stderr, "usage: se [Defines] %s [interval]\n", program_name); fprintf(stderr, "%s can use the following environmental variables:\n", program_name); fprintf(stderr, " setenv OUTDIR /var/orcallator/logs - log file directory, default stdout\n"); fprintf(stderr, " setenv WEB_SERVER apache - string to search for number of web servers\n"); fprintf(stderr, " setenv WEB_LOG /ns-home/httpd-80/logs/access - location of web server log\n"); fprintf(stderr, " setenv GATEWAY some.where.com - special address to monitor\n"); fprintf(stderr, " setenv SEARCHURL srch.cgi - match for search scripts, default is search.cgi\n"); fprintf(stderr, " setenv COMPRESSOR \"gzip -9\" - compress previous day logs using this command\n"); fprintf(stderr, "Defines:\n"); fprintf(stderr, " -DWATCH_WEB watch web server access logs\n"); fprintf(stderr, " -DWATCH_PROXY use WEB_LOG as a NCSA style proxy log\n"); fprintf(stderr, " -DWATCH_SQUID use WEB_LOG as a Squid log\n"); fprintf(stderr, " -DWATCH_OS includes all of the below:\n"); fprintf(stderr, " -DWATCH_CPU watch the cpu load, run queue, etc\n"); fprintf(stderr, " -DWATCH_MUTEX watch the number of mutex spins\n"); fprintf(stderr, " -DWATCH_NET watch all Ethernet interfaces\n"); fprintf(stderr, " -DWATCH_TCP watch all the TCP/IP stack\n"); fprintf(stderr, " -DWATCH_NFS watch NFS requests\n"); fprintf(stderr, " -DWATCH_MOUNTS watch usage of mount points\n"); fprintf(stderr, " -DWATCH_DISK watch disk read/write usage\n"); fprintf(stderr, " -DWATCH_DNLC watch the directory name lookup cache\n"); fprintf(stderr, " -DWATCH_INODE watch the inode cache\n"); fprintf(stderr, " -DWATCH_RAM watch memory usage\n"); fprintf(stderr, " -DWATCH_PAGES watch where pages are allocated\n"); exit(1); break; } // Initialize the various structures. initialize(); // Run forever. If WATCH_WEB is defined, then have measure_web() // do the sleeping while it is watching the access log file until the // next update time for the whole operating system. Also, collect the // data from the access log file before printing any output. for (;;) { // Calculate the next time to sleep to that is an integer multiple of // the interval time. Make sure that at least half of the interval // passes before waking up. now = time(0); sleep_till = (now/interval)*interval; while (sleep_till < now + interval*0.5) { sleep_till += interval; } #ifdef WATCH_WEB measure_web(sleep_till); #else sleep_till_and_count_new_proceses(sleep_till); #endif // Get the current time. now = time(0); tm_now = localtime(&now); measure_os(now, tm_now); #ifdef WATCH_WEB put_httpd(); #endif // Get a file descriptor to write to. Maintains daily output files. checkoutput(tm_now); // Print the output. 
    flush_output();
  }

  return 0;
}

initialize()
{
#ifdef WATCH_CPU
  int i;
#endif

  // Get the command to compress the log files.
  if (compress == nil || compress == "") {
    compress = nil;
  }
  else {
    compress = sprintf("%s %%s &", compress);
  }

#ifdef WATCH_CPU
  // Initialize the process spawning rate measurement variables.
  // Determine if the kernel can be read to measure the last pid.
  i = open("/dev/kmem", O_RDONLY);
  if (i != -1) {
    close(i);
    can_read_kernel = 1;
    mpid_previous  = kvm$mpid;
    mpid_then      = gethrtime();
    mpid_current   = mpid_previous;
    mpid5_then     = mpid_then;
    mpid5_previous = mpid_previous;
    mpid5_current  = mpid_previous;
    mpid5_rate     = 0;
  }
#endif

#ifdef WATCH_WEB
  // Initialize those variables that were not set with environmental
  // variables.
  if (www_search_url == nil || www_search_url == "") {
    www_search_url = "search.cgi";
  }
  if (www_server_proc_name == nil || www_server_proc_name == "") {
    www_server_proc_name = "httpd";
  }
  if (www_gateway == nil || www_gateway == "" ) {
    www_gateway = "NoGatway";
    www_gatelen = 0;
  }
  else {
    www_gatelen = strlen(www_gateway);
  }

  // Initialize the web server watching variables.  Move the file pointer
  // to the end of the web access log and note the current time.
  if (www_log_filename != nil) {
    www_fd = fopen(www_log_filename, "r");
    if (www_fd != 0) {
      stat(www_log_filename, www_stat);
      www_ino  = www_stat[0].st_ino;
      www_size = www_stat[0].st_size;
      // Move to the end of the file.
      fseek(www_fd, 0, 2);
    }
  }
  www_then  = gethrtime();
  www5_then = www_then;
#endif

  // Sleep to give the disks a chance to update.
  sleep(DISK_UPDATE_RATE);

  // Get the clock tick rate.
  hz = sysconf(_SC_CLK_TCK);

  // Get the page size.
  page_size = sysconf(_SC_PAGESIZE);

  // Calculate the system boot time.
  boot_time = time(0) - (kstat$misc.clk_intr / hz);

  // Perform the first measurement of the system.
  _measure_os();
}

// Measure the system statistics all at once.
_measure_os()
{
  tmp_lrcpu   = lr_cpu$cpu;
  tmp_mutex   = lr_mutex$m;
  tmp_nr      = lr_net$nr;
  tmp_lrtcp   = lr_tcp$tcp;
#ifdef WATCH_TCP
  tmp_tcp     = tcp$tcp;
#endif
  tmp_lrpcc   = lr_rpcclient$r;
  tmp_dr      = lr_disk$dr;
  tmp_lrdnlc  = lr_dnlc$dnlc;
  tmp_lrinode = lr_inode$inode;
  tmp_lrram   = lr_ram$ram;
#ifdef WATCH_PAGES
  tmp_kstat_pages = kstat$pages;
#endif
  tmp_lrswap  = lr_swapspace$s;
  tmp_lrkmem  = lr_kmem$kmem;
  tmp_kstat_misc = kstat$misc;
}

measure_os(long now, tm_t tm_now)
{
  // Measure the system now.
  _measure_os();

  // Take care of miscellaneous measurements.
  measure_misc(now, tm_now);

  // Take care of cpu.
#ifdef WATCH_CPU
  measure_cpu();
#endif

  // Take care of mutexes.
#ifdef WATCH_MUTEX
  measure_mutex();
#endif

  // Take care of mount points.
#ifdef WATCH_MOUNTS
  measure_mounts();
#endif

  // Take care of the disks.
#ifdef WATCH_DISK
  measure_disk();
#endif

  // Take care of ram.
#ifdef WATCH_RAM
  measure_ram();
#endif

  // Take care of the network.
#ifdef WATCH_NET
  measure_net();
#endif

  // Take care of TCP/IP.
#ifdef WATCH_TCP
  measure_tcp();
#endif

  // Take care of NFS.
#ifdef WATCH_NFS
  measure_nfs();
#endif

  // Take care of DNLC.
#ifdef WATCH_DNLC
  measure_dnlc();
#endif

  // Take care of the inode cache.
#ifdef WATCH_INODE
  measure_inode();
#endif

  // Take care of page allocations.
#ifdef WATCH_PAGES measure_pages(); #endif } /* state as a character */ char state_char(int state) { switch(state) { case ST_WHITE: return 'w'; /* OK states are lower case */ case ST_BLUE: return 'b'; case ST_GREEN: return 'g'; case ST_AMBER: return 'A'; /* bad states are upper case to stand out */ case ST_RED: return 'R'; case ST_BLACK: return 'B'; default: return 'I'; /* invalid state */ } } measure_misc(long now, tm_t tm_now) { long uptime; char states[12]; char tm_buf[16]; uptime = now - boot_time; states = "wwwwwwwwwww"; strftime(tm_buf, sizeof(tm_buf), "%T", tm_now); states[0] = state_char(lr_disk$dr.state); states[1] = state_char(lr_net$nr.state); states[2] = state_char(lr_rpcclient$r.state); states[3] = state_char(lr_swapspace$s.state); states[4] = state_char(lr_ram$ram.state); states[5] = state_char(lr_kmem$kmem.state); states[6] = state_char(lr_cpu$cpu.state); states[7] = state_char(lr_mutex$m.state); states[8] = state_char(lr_dnlc$dnlc.state); states[9] = state_char(lr_inode$inode.state); states[10]= state_char(lr_tcp$tcp.state); put_output(" timestamp", sprintf("%10d", now)); put_output("locltime", tm_buf); put_output("DNnsrkcmdit", states); put_output(" uptime", sprintf("%8d", uptime)); } sleep_till_and_count_new_proceses(long sleep_till) { long now; #ifdef WATCH_CPU long sleep_till1; int mpid5_diff; double mpid5_interval; double rate; #endif now = time(0); while (now < sleep_till) { #ifdef WATCH_CPU if (can_read_kernel != 0) { // Sleep at least 5 seconds to make a measurement. sleep_till1 = now + 5; while (now < sleep_till1) { sleep(sleep_till1 - now); now = time(0); } // Measure the 5 second process creation rate. mpid5_current = kvm$mpid; mpid5_now = gethrtime(); mpid5_interval = (mpid5_now - mpid5_then) * 0.000000001; mpid5_then = mpid5_now; if (mpid5_current >= mpid5_previous) { mpid5_diff = mpid5_current - mpid5_previous; } else { mpid5_diff = mpid5_current + DEFAULT_MAXPID - mpid5_previous; } rate = mpid5_diff/mpid5_interval; if (rate > mpid5_rate) { mpid5_rate = rate; } mpid5_previous = mpid5_current; // Now take these results to measure the long interval rate. // Because the mpid may flip over DEFAULT_MAXPID more than once // in the long interval time span, use the difference between // the previous and current mpid over a 5 second interval to // calculate the long interval difference. mpid_current += mpid5_diff; mpid_now = mpid5_now; } else { sleep(sleep_till - now); } #else sleep(sleep_till - now); #endif now = time(0); } } #ifdef WATCH_CPU measure_cpu() { p_vmstat pvm; double mpid_interval; double mpid_rate; pvm = vmglobal_total(); // In SE 3.0 user_time and system_time are int and in SE 3.1 they are // double, so cast everything to double using + 0.0. put_output(" usr%", sprintf("%5.1f", pvm.user_time + 0.0)); put_output(" sys%", sprintf("%5.1f", pvm.system_time + 0.0)); put_output(" 1runq", sprintf("%6.2f", tmp_kstat_misc.avenrun_1min/256.0)); put_output(" 5runq", sprintf("%6.2f", tmp_kstat_misc.avenrun_5min/256.0)); put_output("15runq", sprintf("%6.2f", tmp_kstat_misc.avenrun_15min/256.0)); put_output("#proc", sprintf("%5lu", tmp_kstat_misc.nproc)); put_output("scanrate", sprintf("%8.3f", pvm.scan)); // Calculate the rate of new process spawning. if (can_read_kernel != 0) { mpid_interval = (mpid_now - mpid_then) * 0.000000001; mpid_rate = (mpid_current - mpid_previous) / mpid_interval; put_output("#proc/s", sprintf("%7.3f", mpid_rate)); put_output("#proc/p5s", sprintf("%9.4f", mpid5_rate)); // Reset counters. 
mpid_then = mpid_now; mpid_previous = mpid_current; mpid5_rate = 0; } } #endif #ifdef WATCH_MUTEX measure_mutex() { put_output(" smtx", sprintf("%5d", tmp_mutex.smtx)); put_output("smtx/cpu", sprintf("%8d", tmp_mutex.smtx/tmp_mutex.ncpus)); put_output("ncpus", sprintf("%5d", tmp_mutex.ncpus)); } #endif #ifdef WATCH_NET measure_net() { int previous_count = -1; int current_count; int i; current_count = 0; for (i=0; i peak_disk_busy) { peak_disk_busy = GLOBAL_disk[i].run_percent; } } mean_disk_busy = mean_disk_busy/GLOBAL_disk_count; put_output("disk_peak", sprintf("%9.3f", peak_disk_busy)); put_output("disk_mean", sprintf("%9.3f", mean_disk_busy)); put_output("disk_rd/s", sprintf("%9.1f", total_reads)); put_output("disk_wr/s", sprintf("%9.1f", total_writes)); put_output("disk_rK/s", sprintf("%9.1f", total_readk)); put_output("disk_wK/s", sprintf("%9.1f", total_writek)); // If the number of disks has changed, say due to a add_drv, then print // new headers. if (previous_count != GLOBAL_disk_count) { print_header = 1; previous_count = GLOBAL_disk_count; } } #endif #ifdef WATCH_DNLC measure_dnlc() { put_output("dnlc_ref/s", sprintf("%10.3f", tmp_lrdnlc.refrate)); put_output("dnlc_hit%", sprintf("%9.3f", tmp_lrdnlc.hitrate)); } #endif #ifdef WATCH_INODE measure_inode() { put_output("inod_ref/s", sprintf("%10.3f", tmp_lrinode.refrate)); put_output("inod_hit%", sprintf("%9.3f", tmp_lrinode.hitrate)); put_output("inod_stl/s", sprintf("%10.3f", tmp_lrinode.iprate)); } #endif #ifdef WATCH_RAM measure_ram() { put_output("swap_avail", sprintf("%10ld", GLOBAL_pvm[0].swap_avail)); put_output("page_rstim", sprintf("%10d", tmp_lrram.restime)); put_output(" freememK", sprintf("%10d", GLOBAL_pvm[0].freemem)); put_output("free_pages", sprintf("%10d", (GLOBAL_pvm[0].freemem*1024)/page_size)); } #endif #ifdef WATCH_PAGES measure_pages() { put_output("pp_kernel", sprintf("%9lu", tmp_kstat_pages.pp_kernel)); put_output("pagesfree", sprintf("%9lu", tmp_kstat_pages.pagesfree)); put_output("pageslock", sprintf("%9lu", tmp_kstat_pages.pageslocked)); put_output("pagesio", sprintf("%7lu", tmp_kstat_pages.pagesio)); put_output("pagestotl", sprintf("%9lu", tmp_kstat_pages.pagestotal)); } #endif #ifdef WATCH_WEB // Breakdown access log format. accesslog(string buf) { int z; int size_index; int ishead; string word; char first_byte[1]; #if WATCH_PROXY || WATCH_SQUID || WATCH_YAHOO double xf; #ifdef WATCH_SQUID string logtag; string request; #endif #ifdef WATCH_YAHOO string arg; ulong ptr; ulong tmp; ulong ulong_xf; #endif #endif ishead = 0; #ifdef WATCH_YAHOO // Make sure that the input line has at least 32 bytes of data plus a new // line, for a total length of 33. if (strlen(buf) < 33) { return; } word = strtok(buf,"\05"); #else word = strtok(buf," "); #endif if (word == nil) { return; } #ifdef WATCH_SQUID // Word contains unix time in seconds.milliseconds. 
word = strtok(nil, " "); // Elapsed request time in ms xf = atof(word)/1000.0; www_dwnld_time_sum += xf; #ifdef DINKY printf("time: %s %f total %f\n", word, xf, xfer_sum); #endif word = strtok(nil, " "); // Client IP address logtag = strtok(nil, "/"); // Log tag word = strtok(nil, " "); // Reply code WWW_REPLY_CODE(word) word = strtok(nil, " "); // Size sent to client z = atoi(word); request = strtok(nil, " "); // Request method word = strtok(nil, " "); // URL if (word != nil) { if (word =~ "cgi-bin") { httpop_cgi_bins++; } if (word =~ www_search_url) { httpop_searches++; } } word = strtok(nil, " "); // Optional user ident word = strtok(nil, "/"); // Hierarchy if (word != nil) { if (word =~ "DIRECT") { prxy_squid_indirect++; } } word = strtok(nil, " "); // Hostname word = strtok(nil, " "); // Content-type // Process the collected data. if (logtag =~ "TCP") { squid_client_http++; } if (logtag =~ "UDP") { squid_icp_requests++; } if (logtag =~ "HIT") { prxy_squid_cache_hits++; } if (logtag =~ "MISS") { squid_cache_misses++; } WWW_METHOD(request) // Do not add size if its a HEAD. if (ishead == 0) { dwnld_totalz += z; } WWW_SIZE_INDEX(z, size_index) www_dwnld_time_by_size[size_index] += xf; #elif WATCH_YAHOO // Yahoo log format. Fields in square brackets will only appear in the // log file if the data actually exists (ie. you will never see a null // Referrer field). Further, fields labelled here with "(CONFIG)" will // only appear if they are enabled via the YahooLogOptions configuration // directive. // // IP Address (8 hex digits) // Timestamp (time_t as 8 hex digits) // Processing Time (in microseconds, as 8 hex digits) // Bytes Sent (8 hex digits) // URL // [^Er referrer] (CONFIG) // [^Em method] (CONFIG) // [^Es status_code] // ^Ed signature // \n // Ignore the IP address and timestamp. Get the processing time, the // number of bytes sent and the URL. For each portion of the line, split // it up into separate pieces. if (sscanf(word, "%8lx%8lx%8x%8x", &tmp, &tmp, &ulong_xf, &z) != 4) { return; } xf = ulong_xf/1000000.0; WWW_SIZE_INDEX(z, size_index) www_dwnld_time_sum += xf; www_dwnld_time_by_size[size_index] += xf; if (word =~ "cgi-bin") { httpop_cgi_bins++; } if (word =~ www_search_url) { httpop_searches++; } for (;;) { word = strtok(nil, "\05"); if (word == nil) { break; } first_byte = word; ptr = &word + 1; arg = ((string) ptr); ptr = 0; switch (first_byte[0]) { case 'm': WWW_METHOD(arg) ptr = 1; break; case 's': WWW_REPLY_CODE(arg) break; default: break; } } // If no method was seen, then assume it was a GET. if (ptr == 0) { httpop_gets++; } // Do not add size if its a HEAD. if (ishead == 0) { dwnld_totalz += z; } #else /* common or netscape proxy formats */ strtok(nil, " "); // - strtok(nil, " "); // - strtok(nil, " ["); // date strtok(nil, " "); // zone] word = strtok(nil, " \""); // GET or POST WWW_METHOD(word) word = strtok(nil, " "); // URL if (word != nil) { if (word =~ "cgi-bin") { httpop_cgi_bins++; } if (word =~ www_search_url) { httpop_searches++; } } // Sometimes HTTP/1.x is not listed in the access log. Skip it // if it does exist. Load the error/success code. word = strtok(nil, " "); if (word != nil && (word =~ "HTTP" || word =~ "http")) { word = strtok(nil, " "); } WWW_REPLY_CODE(word) word = strtok(nil, " "); // Bytes transferred. if (word != nil) { z = atoi(word); if (ishead == 0) { // Do not add size if its a HEAD. 
  strtok(nil, " ");                   // -
  strtok(nil, " ");                   // -
  strtok(nil, " [");                  // date
  strtok(nil, " ");                   // zone]
  word = strtok(nil, " \"");          // GET or POST
  WWW_METHOD(word)
  word = strtok(nil, " ");            // URL
  if (word != nil) {
    if (word =~ "cgi-bin") {
      httpop_cgi_bins++;
    }
    if (word =~ www_search_url) {
      httpop_searches++;
    }
  }

  // Sometimes HTTP/1.x is not listed in the access log.  Skip over it
  // if it does exist.  Load the error/success code.
  word = strtok(nil, " ");
  if (word != nil && (word =~ "HTTP" || word =~ "http")) {
    word = strtok(nil, " ");
  }
  WWW_REPLY_CODE(word)

  word = strtok(nil, " ");            // Bytes transferred.
  if (word != nil) {
    z = atoi(word);
    // Do not add the size if it's a HEAD.
    if (ishead == 0) {
      dwnld_totalz += z;
    }
    WWW_SIZE_INDEX(z, size_index)
  }

#ifdef WATCH_PROXY
  word = strtok(nil, " ");            // status from server
  word = strtok(nil, " ");            // length from server
  word = strtok(nil, " ");            // length from client POST
  word = strtok(nil, " ");            // length POSTed to remote
  word = strtok(nil, " ");            // client header req
  word = strtok(nil, " ");            // proxy header resp
  word = strtok(nil, " ");            // proxy header req
  word = strtok(nil, " ");            // server header resp
  word = strtok(nil, " ");            // transfer total secs
  word = strtok(nil, " ");            // route
  // - DIRECT PROXY(host.domain:port) SOCKS
  if (word != nil) {
    if (strncmp(word, "PROXY", 5) == 0 || strncmp(word, "SOCKS", 5) == 0) {
      prxy_squid_indirect++;
    }
  }
  word = strtok(nil, " ");            // client finish status
  word = strtok(nil, " ");            // server finish status
  word = strtok(nil, " ");            // cache finish status
  // ERROR HOST-NOT-AVAILABLE = error or incomplete op
  // WRITTEN REFRESHED CL-MISMATCH(content length mismatch) = cache_writes
  // NO-CHECK UP-TO-DATE = cache_hits
  // DO-NOT-CACHE NON-CACHEABLE = uncacheable
  if (word != nil) {
    switch(word) {
      case "WRITTEN":
      case "REFRESHED":
      case "CL-MISMATCH":
        prxy_cache_writes++;
        break;
      case "NO-CHECK":
      case "UP-TO-DATE":
        prxy_squid_cache_hits++;
        break;
      case "DO-NOT-CACHE":
      case "NON-CACHEABLE":
        prxy_uncacheable++;
        break;
      default:
        break;
    }
  }
  word = strtok(nil, " [");           // [transfer total time x.xxx
  if (word != nil) {
    xf = atof(word);
    www_dwnld_time_sum += xf;
    www_dwnld_time_by_size[size_index] += xf;
  }
#endif
#endif
}

measure_web(long sleep_till)
{
  double lastops = 0.0;
  char   buf[BUFSIZ];
  int    i;
  long   now;

  httpops         = 0.0;
  httpops5        = 0.0;
  gateops         = 0.0;
  httpop_gets     = 0;
  httpop_condgets = 0;
  httpop_posts    = 0;
  httpop_cgi_bins = 0;
  httpop_errors   = 0;
  httpop_searches = 0;
  for (i=0; i<5; i++) {
    dwnld_size[i] = 0;
#if WATCH_PROXY || WATCH_SQUID || WATCH_YAHOO
    www_dwnld_time_by_size[i] = 0.0;
#endif
  }
  dwnld_totalz = 0;
#if WATCH_PROXY || WATCH_SQUID || WATCH_YAHOO
  www_dwnld_time_sum = 0.0;
#endif
#if WATCH_PROXY || WATCH_SQUID
  prxy_squid_indirect   = 0;
  prxy_squid_cache_hits = 0;
#ifdef WATCH_PROXY
  prxy_cache_writes = 0;
  prxy_uncacheable  = 0;
#else
  squid_cache_misses = 0;
  squid_icp_requests = 0;
  squid_icp_queries  = 0;
  squid_client_http  = 0;
#endif
#endif

  if (www_log_filename != nil) {
    now = time(0);
    while (now < sleep_till) {
#ifdef WATCH_CPU
      sleep_till_and_count_new_proceses(now + 5);
#else
      sleep(5);
#endif
      now = time(0);
      if (www_fd != 0) {
        while (fgets(buf, BUFSIZ, www_fd) != nil) {
          httpops += 1.0;
          if (www_gatelen > 0) {
            if (strncmp(buf, www_gateway, www_gatelen) == 0) {
              gateops += 1.0;
            }
          }
          accesslog(buf);
        }
      }

      /* see if the file has been switched or truncated */
      stat(www_log_filename, www_stat);
      if (www_ino != www_stat[0].st_ino || www_size > www_stat[0].st_size) {
        if (www_fd != 0) {
          fclose(www_fd);             /* close the old log */
        }
        /* log file has changed, open the new one */
        www_fd = fopen(www_log_filename, "r");
        if (www_fd != 0) {
          www_ino = www_stat[0].st_ino;
          while(fgets(buf, BUFSIZ, www_fd) != nil) {
            httpops += 1.0;
            if (www_gatelen > 0) {
              if (strncmp(buf, www_gateway, www_gatelen) == 0) {
                gateops += 1.0;
              }
            }
            accesslog(buf);
          }
        }
      }

      www5_now = gethrtime();
      www5_interval = (www5_now - www5_then) * 0.000000001;
      www5_then = www5_now;
      dtmp = (httpops - lastops)/www5_interval;
      if (dtmp > httpops5) {
        httpops5 = dtmp;
      }
      lastops = httpops;

      // Remember the log file size for next time.
      www_size = www_stat[0].st_size;
    }
  } else {
    sleep_till_and_count_new_proceses(sleep_till);
    www5_now = gethrtime();
  }

  www_now      = www5_now;
  www_interval = (www_now - www_then) * 0.000000001;
  www_then     = www_now;

  // Use dtmp to get percentages.
  if (httpops == 0.0) {
    dtmp = 0.0;
  } else {
    dtmp = 100.0 / httpops;
  }

#if WATCH_PROXY || WATCH_SQUID || WATCH_YAHOO
  for (i=0; i<5; i++) {
    if (dwnld_size[i] == 0) {
      www_dwnld_time_by_size[i] = 0.0;
    } else {
      www_dwnld_time_by_size[i] = www_dwnld_time_by_size[i]/dwnld_size[i];
    }
  }
#endif
}

int count_proc(string name)
{
  int count;
  prpsinfo_t p;

  count = 0;
  for (p=first_proc(); p.pr_pid != -1; p=next_proc()) {
    if (p.pr_fname =~ name) {
      count++;
    }
  }
  return count;
}

put_httpd()
{
  put_output("#httpds",  sprintf("%7d",   count_proc(www_server_proc_name)));
  put_output("httpop/s", sprintf("%8.2f", httpops/www_interval));
  put_output("http/p5s", sprintf("%8.2f", httpops5));
  put_output("cndget/s", sprintf("%8.2f", httpop_condgets/www_interval));
  put_output("search/s", sprintf("%8.3f", httpop_searches/www_interval));
  put_output("  cgi/s",  sprintf("%8.3f", httpop_cgi_bins/www_interval));
  put_output(" htErr/s", sprintf("%8.3f", httpop_errors/www_interval));
  put_output(" httpb/s", sprintf("%8.0f", dwnld_totalz/www_interval));
  put_output("  %to1KB", sprintf("%8.2f", dtmp*dwnld_size[0]));
  put_output(" %to10KB", sprintf("%8.2f", dtmp*dwnld_size[1]));
  put_output("%to100KB", sprintf("%8.2f", dtmp*dwnld_size[2]));
  put_output("  %to1MB", sprintf("%8.2f", dtmp*dwnld_size[3]));
  put_output("%over1MB", sprintf("%8.2f", dtmp*dwnld_size[4]));
  put_output(www_gateway, sprintf("%8.2f", gateops/www_interval));
#if WATCH_PROXY || WATCH_SQUID
  put_output("  %indir", sprintf("%8.2f", dtmp * prxy_squid_indirect));
  put_output("%cch_hit", sprintf("%8.2f", dtmp * prxy_squid_cache_hits));
#ifdef WATCH_PROXY
  put_output("%cch_wrt", sprintf("%8.2f", dtmp * prxy_cache_writes));
  put_output("%cch_unc", sprintf("%8.2f", dtmp * prxy_uncacheable));
#else
  put_output("%cch_mis", sprintf("%8.2f", dtmp * squid_cache_misses));
  put_output("%cch_req", sprintf("%8.2f", dtmp * squid_icp_requests));
  put_output("%cch_qry", sprintf("%8.2f", dtmp * squid_icp_queries));
#endif
  put_output("   xfr_t", sprintf("%8.2f", 0.01 * dtmp * www_dwnld_time_sum));
  put_output("  xfr1_t", sprintf("%8.2f", www_dwnld_time_by_size[0]));
  put_output(" xfr10_t", sprintf("%8.2f", www_dwnld_time_by_size[1]));
  put_output("xfr100_t", sprintf("%8.2f", www_dwnld_time_by_size[2]));
  put_output(" xfr1M_t", sprintf("%8.2f", www_dwnld_time_by_size[3]));
  put_output("xfro1M_t", sprintf("%8.2f", www_dwnld_time_by_size[4]));
#elif WATCH_YAHOO
  put_output("   wprc_t", sprintf("%9.5f", 0.01 * dtmp * www_dwnld_time_sum));
  put_output("  wprc1_t", sprintf("%9.5f", www_dwnld_time_by_size[0]));
  put_output(" wprc10_t", sprintf("%9.5f", www_dwnld_time_by_size[1]));
  put_output("wprc100_t", sprintf("%9.5f", www_dwnld_time_by_size[2]));
  put_output(" wprc1M_t", sprintf("%9.5f", www_dwnld_time_by_size[3]));
  put_output("wprco1M_t", sprintf("%9.5f", www_dwnld_time_by_size[4]));
#endif
}
#endif

From bzajac at geostaff.com  Thu Oct 14 14:42:32 1999
From: bzajac at geostaff.com (Blair Zajac)
Date: Thu, 14 Oct 1999 14:42:32 -0700
Subject: [Orca-announce] Much improved Orca 0.24 available
Message-ID: <38064E48.EB0BE96D@geostaff.com>

After many months of improvements, I have finished a new release of Orca.
It does many of the things people have been requesting, such as measuring
disk space usage and the ability to make plots with many Ethernet
interfaces.
I am including the major feature list at the end of this message.

To see what Orca can do, check out

    http://www.geocities.com/ResearchTriangle/Thinktank/4996/orca-example/

To get Orca, go to

    http://www.geocities.com/ResearchTriangle/Thinktank/4996/pub/

and get orca-0.24.tar.gz.  This version of Orca includes the latest
version of orcallator, 1.19.

Enjoy,
Blair

PS To continue development of Orca, I'm looking for some people who would
be willing to have a link from my pages to your Orca pages to show that
Orca is being used.

New in Orca version 0.24.

 1) Installation notes.  Due to the way Orca generated image and HTML
    files are named, read the INSTALL file.  Otherwise, you will have some
    left-over cruft that wastes disk space and you will have to reload all
    of your source data files.

 2) Orca now runs under older Perls: Perl 5.004_01 or later.

 3) Switch to generating PNGs instead of GIFs.  They take up 1/3 less disk
    space and are created at least 10% faster.  If you want Orca to
    generate GIFs instead of PNGs, give it the -gifs flag.

 4) Read in .gz files using gunzip, .Z files using uncompress, and .bz2
    files with bunzip2.

 5) Add to the plot title the type of the plot (i.e. 'Yearly') in every
    image.

 6) Add a href field for a plot.  If defined, this appends an HREF to a
    target name, letting you point the user to a page with more
    information.  The default orcallator.cfg has these added to point to
    the new documentation web page
    http://www.geocities.com/~bzking/docs/orcallator.html for all
    orcallator.se recorded data.

 7) Add a new option named base that lets the user specify either a base
    of 1000 or 1024 for autoscaling plots.  This is useful for memory
    (base 1024) vs network (base 1000) measurements and is used to
    correctly calculate the Y axis labels.  (See the sketch after this
    list.)

 8) The word logarithmic can now be used for a plot to create a
    logarithmic Y axis scale.

 9) Orca no longer lists all the source files it finds, to shorten
    verbose output.  This output is now generated if the verbose level is
    greater than one.

10) Do not overwrite existing HTML files when creating new versions until
    the new version is finished.  This allows people to view the existing
    pages until the new page is completely finished.

11) All generated HTML and image filenames are now created using a
    different set of mappings:

        orcallator -> o
        orca       -> o
        _times_    -> _X_
        _percent_  -> _pct_
        _number_   -> _num_

    All older installations of Orca will need to be renamed unless you
    want to load in all your data again.  You can perform this renaming
    on your files by running make upgrade, or if you have directories
    that are not normally covered by the Orca install, run
    src/upgrade_installation followed by the list of directories to
    search in.  Pass the -n flag to upgrade_installation if you want to
    see the renames that would be performed without actually performing
    them.

12) New HTML and image filenames are shorter, which can save 10% in disk
    space on a large installation according to du -k.  Also, plots
    containing arbitrarily many data sources can now be created without
    exceeding the maximum filename length.

13) Add locking so that only one Orca can run on a single configuration
    file at one time.

14) Include and require RRDtool 1.0.7.2 and Math::Interpolate 1.05.
    Include Data::Dumper 2.101, Digest::MD5 2.09, and Storable 0.6.5 with
    Orca.
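To illustrate why the base in item 7 matters, here is a minimal sketch of
the general autoscaling idea in plain C.  This is not Orca's actual code,
and the autoscale() helper is hypothetical; it only shows how the same
value labels differently under the two bases:

    #include <stdio.h>

    /* Hypothetical helper: scale a value down by the given base and
     * report which prefix the scaled value should carry. */
    static double autoscale(double value, double base, char *prefix)
    {
        const char *prefixes = " kMGT";  /* none, kilo, mega, giga, tera */
        int i = 0;

        while (value >= base && i < 4) {
            value /= base;
            i++;
        }
        *prefix = prefixes[i];
        return value;
    }

    int main(void)
    {
        char p;
        double v;

        v = autoscale(2097152.0, 1024.0, &p);
        printf("base 1024: %.1f %cB\n", v, p);   /* prints 2.0 MB */

        v = autoscale(2097152.0, 1000.0, &p);
        printf("base 1000: %.1f %cb\n", v, p);   /* prints 2.1 Mb */

        return 0;
    }

The same byte count labels as 2.0M under base 1024 but 2.1M under base
1000, which is why memory plots and network plots want different bases.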
The following changes are what's new in orcallator.se 1.19 since version
1.7, which was included with Orca 0.23.  All of the changes below are
taken advantage of in the included orcallator.cfg and start_orcallator
files.

15) Orcallator.se now has a web page describing the various measurements
    it makes.  See http://www.geocities.com/~bzking/docs/orcallator.html.

16) If the environmental variable WEB_SERVER is defined, use its value as
    the name of the process to count for the number of web servers on the
    system.  If WEB_SERVER is not defined, then count the number of
    httpd's.

17) If the COMPRESSOR environmental variable is defined, then when a new
    log file is opened for a new day, the just closed log file is
    compressed using the COMPRESSOR command in the following manner:

        system(sprintf("%s %s &", COMPRESSOR, log_file));

    COMPRESSOR should be set to something like "gzip -9", "compress", or
    "bzip2 -9".  If the configure script finds both a compressor and an
    uncompressor tool, such as both gzip and gunzip, then
    start_orcallator will inform orcallator.se to compress the log files
    at the end of a day.

18) New measurements.  The first column lists the define that must be
    passed to the SE interpreter to obtain the measurement.  If you use
    the WATCH_OS define, then all of these are now measured.

    WATCH_MUTEX  - ncpus     - number of CPUs on the system
    WATCH_CPU    - #proc/s   - 5 minute average process spawn rate if root
    WATCH_CPU    - #proc/p5s - maximum 5 second process spawn rate if root
    WATCH_CPU    - scanrate  - page scan rate in pages per second
    WATCH_DISK   - disk_runp_c\d+t\d+d\d+ - run percent for each disk
    WATCH_DISK   - disk_rd/s - system wide read operations per second
    WATCH_DISK   - disk_wr/s - system wide write operations per second
    WATCH_DISK   - disk_rK/s - system wide kilobytes read per second
    WATCH_DISK   - disk_wK/s - system wide kilobytes written per second
    WATCH_RAM    - freememK  - number of free kilobytes of memory on the system
    WATCH_MOUNTS - mntC_*    - capacity of disk in kilobytes
    WATCH_MOUNTS - mntc_*    - inode capacity of disk
    WATCH_MOUNTS - mntU_*    - used capacity of disk in kilobytes
    WATCH_MOUNTS - mntu_*    - used inode capacity of disk
    WATCH_MOUNTS - mntA_*    - available kilobytes for non-root users
    WATCH_MOUNTS - mnta_*    - available inodes for non-root users
    WATCH_MOUNTS - mntP_*    - percentage of kilobytes used for non-root users
    WATCH_MOUNTS - mntp_*    - percentage of inodes used for non-root users

19) Add some smarts so that if the number of interfaces, physical disks,
    or mounted partitions changes, then a new header is printed.  This
    will prevent column name and data mixups when the system
    configuration changes.

20) Prevent a division by zero in calculating the mean_disk_busy if the
    number of disks on the system is 0.

21) Fix a bug in the disk_mean calculation where it was being divided by
    the wrong disk_count.  Now it should be much larger and in scale with
    disk_peak.

22) Increase the number of characters for each network interface from
    four to five.

23) If WATCH_YAHOO is defined, then process the access log as a Yahoo!
    style access log.

24) Restructure the code to handle different web server access log
    formats more easily.

From bzajac at geostaff.com  Wed Oct 20 18:43:40 1999
From: bzajac at geostaff.com (Blair Zajac)
Date: Wed, 20 Oct 1999 18:43:40 -0700
Subject: [Orca-announce] ANNOUNCE: Orca 0.25 released
Message-ID: <380E6FCC.FB1354D4@geostaff.com>

New in Orca version 0.25.

1) I have moved companies from Yahoo!/GeoCities to Akamai.  My email
   address has changed to blair at akamai.com and the old one will no
   longer work.  Also, I have moved my entire Orca archive from
   www.geocities.com/~bzking/ to www.gps.caltech.edu/~blair/orca/

2) Fix an important bug where Orca would crash if column names mentioned
   in the configuration file did not exist in the source data files.

3) Orca, when it receives the HUP signal, will look for new source data
   files the next time it runs through the main loop.  If you have a
   constantly running Orca, this is a simpler and faster solution than
   restarting Orca, which takes time to reread all the source files.
   (A sketch of the general mechanism follows below.)
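For readers unfamiliar with the pattern behind item 3: Orca itself is
written in Perl, so the C sketch below is only an illustration of the
standard signal-flag technique, not Orca's code.  A long-running daemon
typically handles HUP by setting a flag in the handler and acting on it
at the top of its main loop:

    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Flag shared between the signal handler and the main loop.
     * volatile sig_atomic_t is the portable type for this. */
    static volatile sig_atomic_t rescan_requested = 0;

    static void on_sighup(int sig)
    {
        (void)sig;
        rescan_requested = 1;
    }

    int main(void)
    {
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_sighup;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGHUP, &sa, NULL);

        for (;;) {
            if (rescan_requested) {
                rescan_requested = 0;
                puts("rescanning for new source data files");
                /* ... re-glob the data file directories here ... */
            }
            /* ... one pass of normal processing ... */
            sleep(30);
        }
    }

With this arrangement, kill -HUP <pid> tells the daemon to rescan on its
next pass instead of paying the full cost of a restart.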
4) Add a new make target orcallator_run_at_boot which installs the
   appropriate files into the /etc/*.d/ directories to start orcallator
   at boot time and shut it down at halt time.

5) The previous default orcallator.cfg would not find compressed
   orcallator files.

6) Remove the plot configuration option `optional', which made plots
   optional.  Plots are now optional by default; use the keyword
   `required' to make them required.  If a plot is required, then it will
   always display, even if there is no data loaded for it.

7) Change the behavior of warnings when data requested to be plotted in
   the configuration file does not exist in the source data files.  Any
   resulting errors from this, such as "cannot compile" errors, are only
   sent when the verbose level is above 1 or when the plot is required.

8) Include Storable 0.6.7 but continue to only require Storable 0.6.3.

About Orca

Orca is a freely available, general use tool that collects numerical
data from text files and, using a configuration file that defines the
data, generates a tree of HTML files containing daily, weekly, monthly,
quarterly, and yearly views of the data.  A well designed configuration
allows the user to view the same data from different sources, such as
CPU usage across many hosts, or many different data sets from the same
source.

Bundled with Orca is an SE toolkit program orcallator.se that gathers
many different system statistics for Solaris based systems, such as web
server hits per second, CPU usage, hme0 bits/second, disk usage and run
percent, page usage, etc.

Orca's NEW home page is

    http://www.gps.caltech.edu/~blair/orca/

To see what Orca and orcallator can do, check out

    http://www.gps.caltech.edu/~blair/orca/orca-example/

To get Orca, go to

    http://www.gps.caltech.edu/~blair/orca/pub/

and get orca-0.25.tar.gz.  This version of Orca includes the latest
version of orcallator, 1.20.

For any discussions regarding Orca, please direct your email to one of
the Orca mailing lists.  For this list, please see

    http://www.gps.caltech.edu/~blair/orca/#mailing_lists

Please contact me via email at blair at akamai.com.

Enjoy,
Blair

PS To continue development of Orca, I'm looking for some people who would
be willing to have a link from my pages to your Orca pages to show that
Orca is being used.

From blair at akamai.com  Wed Oct 27 17:55:25 1999
From: blair at akamai.com (Blair Zajac)
Date: Wed, 27 Oct 1999 17:55:25 -0700
Subject: [Orca-announce] Orca talk at ISI in Los Angeles
Message-ID: <38179EFD.B6274C68@akamai.com>

Good evening,

Sorry for the late notice here, but I thought you would want to know
about this.

I am giving a talk in Los Angeles on Thursday, October 28, 1999 at
1:30 PM at ISI, the Information Sciences Institute associated with USC,
located in Marina del Rey, on some work I did developing a data
gathering and monitoring tool named Orca while I was at GeoCities.  In
addition, I will probably spend some time discussing what it takes to
serve one of the largest web sites in the world and what my new company
does.
I hope to see you at the talk.

Blair

http://www.isi.edu/divisions/div7/networking_seminar.html
-------------- next part --------------
Networking Seminar Series
USC/Information Sciences Institute
Guest Speakers 1999
===================

[TABLE NOT SHOWN]

Location
--------
All talks are in the ISI 11th Floor large conference room unless
otherwise noted.  ISI is located at 4676 Admiralty Way; Marina del Rey,
California 90292.

Blair Zajac, Ph.D.
------------------
Technical Yahoo!
Yahoo!/GeoCities

Thursday October 28, 1999
1:30 PM

Monitoring Extremely Large Web Sites

Abstract:

What does it take to serve one of the web's largest web sites?
According to Media Metrix, GeoCities had the fourth largest reach of all
domains on the web in August 1999.  I will discuss Orca, an MRTG-like
tool that I built to monitor short-term and long-term system statistics.
This tool proved to be invaluable in diagnosing system performance
problems.

Bio:

Blair Zajac is a Technical Yahoo! at Yahoo!/GeoCities, where he focuses
on Web site architecture and performance issues, including networking
hardware, content storage, international distribution, server operating
systems, and Web server software.  He is the author of the Orca
monitoring system and was a key developer of the freely available Amanda
backup software system.  Before moving to Yahoo!/GeoCities, he received
a Ph.D. in geophysics from Caltech, where he applied genetic algorithms
and other numerical techniques to analyzing the stress state in the
earth.