Thursday, April 29, 2010

Interesting set intersection by using 'uniq'

I was asked to find the intersection of two IP lists yesterday. I wrote a simple Perl program that iterates over both lists to find the IPs that appear in each, which is an O(n^2) algorithm.

foreach my $ip_a (@A) {
    foreach my $ip_b (@B) {
        if ($ip_a eq $ip_b) {
            ...
        }
    }
}

When I woke up this morning, another way to do this popped into my mind.

sort -u file-A > tmp
sort -u file-B >> tmp
sort tmp | uniq -d

Basically, it is an O(n log n) algorithm, dominated by the sort operations.
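For comparison, the same intersection can be sketched in Python with hash sets, which skips the sorting step. This is only a minimal sketch, assuming one IP per line; the function name common_ips and the sample addresses are hypothetical.

```python
def common_ips(lines_a, lines_b):
    """Return the IPs present in both inputs, sorted as strings."""
    # Strip whitespace and drop empty lines; building a set removes
    # duplicates, like `sort | uniq` does for each file.
    set_a = {line.strip() for line in lines_a if line.strip()}
    set_b = {line.strip() for line in lines_b if line.strip()}
    # Set intersection runs in expected time linear in the smaller set.
    return sorted(set_a & set_b)


if __name__ == "__main__":
    list_a = ["10.0.0.1", "10.0.0.2", "10.0.0.2", "10.0.0.3"]
    list_b = ["10.0.0.3", "10.0.0.4", "10.0.0.1"]
    print(common_ips(list_a, list_b))
```

Note that the result is sorted as plain strings, not numerically by octet, which matches what sort does by default in the shell version.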

Wednesday, March 31, 2010

X forwarding in ssh

To enable X forwarding, try the following command:
ssh -X xxx.xxx.xxx.xxx

Then we can launch a GUI program through ssh.

If it does not work as expected, check the DISPLAY environment variable.

export DISPLAY=:10.0

duplicate a table in Oracle

create table [new table name] as select * from [table name]

Wednesday, March 17, 2010

Remove VirtualBox lock

The lock file of VirtualBox is placed in
/tmp/.vbox-[user name]-ipc

Monday, February 8, 2010

Tuesday, January 12, 2010

fighting with oci8 - apache2 + php5 + Oracle

Connecting to the Oracle database by using apache2 + php5 is simple -- if you already know all the traps along the way.

In order to connect to the Oracle database with my apache2 and php5, I read several tutorials. Many thanks to their authors.

I want to talk about my experience -- everything seemed to be done correctly, but phpinfo() still showed no oci8 information.

First, take a look at the Apache error log under /var/log.
In my case, the error log showed:
PHP Warning: PHP Startup: Unable to load dynamic library '/usr/lib/php5/20060613+lfs/oci8.so' - libaio.so.1: cannot open shared object file: No such file or directory in Unknown on line 0

The log already told me the answer: all I had to do was install libaio, which provides libaio.so.1. I did so, and the problem was solved.

Another way to discover the problem is to use the ldd command to examine oci8.so:
linux-gate.so.1 => (0xb7fb3000)
libclntsh.so.11.1 => /opt/oracle/instantclient/libclntsh.so.11.1 (0xb627b000)
libc.so.6 => /lib/i686/cmov/libc.so.6 (0xb610e000)
libnnz11.so => /opt/oracle/instantclient/libnnz11.so (0xb5ec0000)
libdl.so.2 => /lib/i686/cmov/libdl.so.2 (0xb5ebc000)
libm.so.6 => /lib/i686/cmov/libm.so.6 (0xb5e96000)
libpthread.so.0 => /lib/i686/cmov/libpthread.so.0 (0xb5e7d000)
libnsl.so.1 => /lib/i686/cmov/libnsl.so.1 (0xb5e64000)
/lib/ld-linux.so.2 (0xb7fb4000)
libaio.so.1 => not found

It showed the same problem.
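When the ldd output is long, the unresolved entries can be picked out automatically. The following is a minimal sketch that only parses text shaped like the output above; the function name missing_libraries and the shortened SAMPLE string are hypothetical, and it does not run ldd itself.

```python
def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # A resolved entry looks like "name => /path (0xaddr)";
        # an unresolved one looks like "name => not found".
        name, sep, target = line.partition("=>")
        if sep and target.strip() == "not found":
            missing.append(name.strip())
    return missing


# A few lines copied from the ldd output shown above.
SAMPLE = """\
libclntsh.so.11.1 => /opt/oracle/instantclient/libclntsh.so.11.1 (0xb627b000)
libc.so.6 => /lib/i686/cmov/libc.so.6 (0xb610e000)
/lib/ld-linux.so.2 (0xb7fb4000)
libaio.so.1 => not found
"""

print(missing_libraries(SAMPLE))
```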

Tuesday, January 5, 2010

rrdtool - every data is a rate!

This article is about understanding how rrdtool calculates rates.

I've been working on an SA project recently. We measure the RTTs (round-trip times) with the ping command, write them to an rrd file, print out the graph, ...

More information about rrdtool can be found on its website.
http://oss.oetiker.ch/rrdtool/

I first created an rrd file:
rrdtool create F.rrd \
--step 10 \
DS:rtt1:GAUGE:20:0:U \
RRA:MAX:0.5:1:35712

Then, I wrote some values to the F.rrd file. An interesting fact is that when I update the value exactly on a time step (say 10, 20, 30, etc.), the stored value is exact.

# rrdtool update F.rrd 1262733290:3
# rrdtool update F.rrd 1262733300:4
# rrdtool update F.rrd 1262733310:5
# rrdtool update F.rrd 1262733320:6

1262733300: 4.0000000000e+00
1262733310: 5.0000000000e+00
1262733320: 6.0000000000e+00

OK, rrdtool also copes with updates that arrive at inexact timestamps, as long as they arrive within the "heartbeat" (20 seconds in the DS definition above); beyond that, the data is treated as unknown. For example, if we update a value at time 1262733331, rrdtool records it for the step ending at 1262733330. It's a nice feature, since there are always some delays in the networking world.

Let's take a look:

# rrdtool update F.rrd 1262733331:7
# rrdtool update F.rrd 1262733340:8

1262733330: 7.0000000000e+00
1262733340: 7.9000000000e+00

As you can see, the value 7 was updated at 1262733331 but is shifted to 1262733330 in the rrd file. But what happened with the next update at 1262733340? We updated 8, but the rrd file recorded 7.9. Strange, right?

Here is where 7.9 comes from: rrdtool treats every data point as a rate, so if more than one sample arrives within a time step (after the previous step boundary and up to the current one), rrdtool stores the time-weighted average of those samples.

Previous time step: 1262733330
We update 7 at 1262733331 and 8 at 1262733340.
The calculation is
(7 * (1262733331 - 1262733330) +
8 * (1262733340 - 1262733331) ) / 10 = 7.9

Here is a figure to help understanding:


OK, let's do the math with another example:

# rrdtool update F.rrd 1262733341:54
# rrdtool update F.rrd 1262733342:3
# rrdtool update F.rrd 1262733347:31
# rrdtool update F.rrd 1262733350:3

To find out the average value at 1262733350:
previous time step: 1262733340
(54 * (1262733341 - 1262733340) +
3 * (1262733342 - 1262733341) +
31 * (1262733347 - 1262733342) +
3 * (1262733350 - 1262733347)) / 10 = 22.1

Read the rrd file to verify our answer:
1262733350: 2.2100000000e+01

It's a perfect match. :-)
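The arithmetic in both examples can be reproduced with a few lines of Python. This is only a sketch of the hand calculation, not rrdtool's actual code: the function name step_average is hypothetical, and it ignores the heartbeat, unknown data, and consolidation.

```python
def step_average(prev_step, step, updates):
    """Time-weighted average for the step that ends at prev_step + step.

    updates is a list of (timestamp, value) pairs in increasing time
    order, all within (prev_step, prev_step + step]. Each value is taken
    to hold from the previous timestamp up to its own timestamp.
    """
    total = 0.0
    last = prev_step
    for timestamp, value in updates:
        total += value * (timestamp - last)
        last = timestamp
    return total / step


if __name__ == "__main__":
    # First example: 7 at ...331 and 8 at ...340.
    print(step_average(1262733330, 10, [(1262733331, 7), (1262733340, 8)]))
    # Second example: four updates between ...340 and ...350.
    print(step_average(1262733340, 10, [
        (1262733341, 54),
        (1262733342, 3),
        (1262733347, 31),
        (1262733350, 3),
    ]))
```

The two calls correspond to the 7.9 and 22.1 values worked out by hand above.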

I hope this article helps someone who has the same question I had.

Wednesday, December 30, 2009

Config language - sqldeveloper

sqldeveloper\sqldeveloper\bin\sqldeveloper.config

AddVMOption -Duser.language=en
AddVMOption -Duser.region=US

Sunday, December 13, 2009

double fork to avoid zombie process

It is a common mistake to fork a child process without calling wait() or waitpid() to wait for its termination. Without such a call, the child process becomes a zombie after it terminates, because its parent never cleans up its entry in the process table. A zombie process occupies a pid, which decreases the number of pids available in the system. Zombies are marked as "defunct" when you list processes with the "ps" command.

However, sometimes we do not want the parent process to wait a long time for its child. There is a way to both avoid creating a zombie process and avoid waiting for the child to terminate: the double fork.

The idea is simple. When a parent process (say A) wants a child process to do "something", process A does not fork that process directly. Instead, process A first forks a child process (say B), and process B then forks its own child process (say C) to do "something"; process B terminates as soon as process C is created. This way, process A only has to wait for process B for a short time. Meanwhile, since process C has lost its parent (process B is dead), the system reparents process C to the init process. The init process calls wait() for its children, which solves the zombie process problem.

The program looks like this:


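#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

void func(void)
{
    pid_t pid1;
    pid_t pid2;
    int status;

    if ((pid1 = fork()) > 0) {
        /* parent process A: reap B, which exits right away */
        waitpid(pid1, &status, 0);
    } else if (pid1 == 0) {
        /* child process B */
        if ((pid2 = fork()) > 0) {
            /* B exits at once, so C is reparented to init */
            _exit(0);
        } else if (pid2 == 0) {
            /* child process C: "something" stands for the real program */
            char *argv[] = { "something", NULL };
            execvp(argv[0], argv);
            perror("execvp");   /* reached only if execvp fails */
            _exit(127);
        } else {
            /* error: the second fork failed */
            perror("fork");
            _exit(1);
        }
    } else {
        /* error: the first fork failed */
        perror("fork");
    }
}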
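The same pattern can be sketched in Python with os.fork, assuming a Unix-like system. The names run_detached and task are hypothetical, and the pipe in the demo is only there so that process A can see that process C actually ran.

```python
import os


def run_detached(task):
    """Run task() in a grandchild process (C) using a double fork.

    Returns the raw wait status of the intermediate child (B);
    0 means B exited normally with code 0.
    """
    pid1 = os.fork()
    if pid1 > 0:
        # Process A: reap B, which exits almost immediately.
        _, status = os.waitpid(pid1, 0)
        return status

    # Process B: fork C, then exit so that C is reparented to init.
    try:
        pid2 = os.fork()
    except OSError:
        os._exit(1)
    if pid2 == 0:
        # Process C: do the real work, then leave without returning
        # into the caller's code.
        try:
            task()
        finally:
            os._exit(0)
    os._exit(0)


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    status = run_detached(lambda: os.write(write_fd, b"hello from C"))
    os.close(write_fd)
    print(status, os.read(read_fd, 64))
```

os._exit is used in B and C instead of sys.exit so that the children do not run the parent's cleanup handlers or flush its buffered output a second time.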

Monday, December 7, 2009

kill all child processes

killtree () {
    # kill the descendants first, then the process itself
    for child in $(ps -o pid= --ppid "$1")
    do
        killtree "$child"
    done
    echo "kill -9 $1"
    kill -9 "$1" 2>/dev/null
}
killtree {some pid}
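The recursion kills the deepest descendants first and the given pid last. To make that order explicit, here is a minimal sketch that walks a made-up process table instead of calling ps; the function name kill_order and the pids in TABLE are hypothetical, and nothing is actually killed.

```python
def kill_order(pid, children):
    """Return pids in the order killtree would kill them: descendants first."""
    order = []
    for child in children.get(pid, []):
        order.extend(kill_order(child, children))
    order.append(pid)
    return order


# Hypothetical process table: parent pid -> list of child pids.
TABLE = {100: [101, 102], 101: [103]}

print(kill_order(100, TABLE))
```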