<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Joe's Blog! &#187; Clustering</title>
	<atom:link href="http://www.joeandmotorboat.com/category/clustering/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.joeandmotorboat.com</link>
	<description></description>
	<lastBuildDate>Fri, 08 Jan 2010 00:00:31 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Disco.</title>
		<link>http://www.joeandmotorboat.com/2008/09/08/disco/</link>
		<comments>http://www.joeandmotorboat.com/2008/09/08/disco/#comments</comments>
		<pubDate>Mon, 08 Sep 2008 13:05:19 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Dev]]></category>
		<category><![CDATA[Erlang]]></category>
		<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=610</guid>
		<description><![CDATA[Something I happened to see over here this weekend was Disco. It is a Map/Reduce framework written in Erlang. A user/implementer doesn&#8217;t need to know a lick of Erlang to get rolling; according to their site, most folks use Python to write the actual jobs. If you ask me, a Map/Reduce framework built using [...]]]></description>
			<content:encoded><![CDATA[<p>Something I happened to see <a href="http://debasishg.blogspot.com/2008/09/more-erlang-with-disco.html">over here</a> this weekend was <a href="http://discoproject.org/">Disco</a>. It is a Map/Reduce framework written in Erlang. A user/implementer doesn&#8217;t need to know a lick of Erlang to get rolling; according to their site, most folks use Python to write the actual jobs. If you ask me, a Map/Reduce framework built using Erlang makes a great deal of sense due to its message passing and lightweight processes.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2008/09/08/disco/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>More on Hadoop Metrics In Ganglia.</title>
		<link>http://www.joeandmotorboat.com/2008/07/28/more-on-hadoop-metrics-in-ganglia/</link>
		<comments>http://www.joeandmotorboat.com/2008/07/28/more-on-hadoop-metrics-in-ganglia/#comments</comments>
		<pubDate>Thu, 01 Jan 1970 00:00:00 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Dev]]></category>
		<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=511</guid>
		<description><![CDATA[I have gotten a few comments asking whether or not I was able to get Hadoop to talk to Ganglia. Sadly, I wasn&#8217;t able to get this to work properly either, but I did contact the Hadoop mailing list (this thread) and got the following information. There is actually a bug. The link [...]]]></description>
			<content:encoded><![CDATA[<p>I have gotten a few comments asking whether or not I was able to get Hadoop to talk to Ganglia. Sadly, I wasn&#8217;t able to get this to work properly either, but I did contact the Hadoop mailing list (<a href="http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200807.mbox/%3C488799B9.5070404@joetify.com%3E">this thread</a>) and got the following information. There is actually a <a href="https://issues.apache.org/jira/browse/HADOOP-3422">bug</a>. The link includes a patch, but note that the trunk has changed and the patch currently only works on Hadoop version 0.16.0. I have not had a chance to test everything out yet, but it is at least a step in the right direction for those of you who are curious. Hope this helps.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2008/07/28/more-on-hadoop-metrics-in-ganglia/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>More gexec.</title>
		<link>http://www.joeandmotorboat.com/2008/06/04/more-gexec/</link>
		<comments>http://www.joeandmotorboat.com/2008/06/04/more-gexec/#comments</comments>
		<pubDate>Thu, 01 Jan 1970 00:00:00 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Clustering]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=491</guid>
		<description><![CDATA[Bernard Li pushed a new version of gexec out in response to my inquiries on the mailing list, it includes the Ganglia switch.
I did some further code changes and was able to generate a tarball
which builds fine without any modification, can you please try it on
your system and see if it works?  All you [...]]]></description>
			<content:encoded><![CDATA[<p>Bernard Li pushed a new version of gexec out in response to <a href="http://sourceforge.net/mailarchive/message.php?msg_name=d4c731da0806021752u7c9b1feete4efe3f3473290b5%40mail.gmail.com">my inquiries on the mailing list</a>, it includes the Ganglia switch.</p>
<blockquote><p>I did some further code changes and was able to generate a tarball<br />
which builds fine without any modification, can you please try it on<br />
your system and see if it works?  All you need to is run `rpmbuild<br />
-tb` against the tarball:</p>
<p><a href="http://therealms.org/oss/ganglia/gexec-0.3.8.1375.tar.gz">http://therealms.org/oss/ganglia/gexec-0.3.8.1375.tar.gz</a>
</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2008/06/04/more-gexec/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NVIDIA GPU/CUDA Based Supercomputer.</title>
		<link>http://www.joeandmotorboat.com/2008/05/31/nvidia-gpucuda-based-supercomputer/</link>
		<comments>http://www.joeandmotorboat.com/2008/05/31/nvidia-gpucuda-based-supercomputer/#comments</comments>
		<pubDate>Thu, 01 Jan 1970 00:00:00 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Clustering]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=488</guid>
		<description><![CDATA[Check out this sweet machine that the University of Antwerp built.
]]></description>
			<content:encoded><![CDATA[<p>Check out this <a href="http://www.dvhardware.net/article27538.html">sweet machine</a> that the University of Antwerp built.</p>
<p><a href="http://www.joeandmotorboat.com/2008/05/31/nvidia-gpucuda-based-supercomputer/"><em>Click here to view the embedded video.</em></a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2008/05/31/nvidia-gpucuda-based-supercomputer/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Ganglia, gexec, authd and libe Install Procedure.</title>
		<link>http://www.joeandmotorboat.com/2008/05/30/ganglia-gexec-authd-and-libe-install-procedure/</link>
		<comments>http://www.joeandmotorboat.com/2008/05/30/ganglia-gexec-authd-and-libe-install-procedure/#comments</comments>
		<pubDate>Thu, 01 Jan 1970 00:00:00 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=486</guid>
		<description><![CDATA[Install Ganglia

wget http://voxel.dl.sourceforge.net/sourceforge/ganglia/ganglia-3.0.7-1.src.rpm
rpm -Uhv http://apt.sw.be/redhat/el5/en/i386/rpmforge/RPMS/rpmforge-release-0.3.6-1.el5.rf.i386.rpm
yum install libpng-devel libart_lgpl-devel rrdtool-devel freetype-devel
rpmbuild --rebuild ganglia-3.0.7-1.src.rpm
rpm -ivh /usr/src/redhat/RPMS/x86_64/ganglia-gmetad-3.0.7-1.x86_64.rpm /usr/src/redhat/RPMS/x86_64/ganglia-gmond-3.0.7-1.x86_64.rpm /usr/src/redhat/RPMS/x86_64/ganglia-devel-3.0.7-1.x86_64.rpm
Install libe
wget http://www.theether.org/libe/libe-0.3.0-1.src.rpm
rpmbuild --rebuild libe-0.3.0-1.src.rpm
rpm -ivh /usr/src/redhat/RPMS/x86_64/libe-0.3.0-1.x86_64.rpm
Install authd
yum install openssl-devel
wget http://www.theether.org/authd/authd-0.2.2-1.src.rpm
rpmbuild --rebuild authd-0.2.2-1.src.rpm
You will run into an error like the following; don&#8217;t worry about it, we clean it up next.
Installing authd-0.2.2-1.src.rpm
warning: user bnc does not exist - [...]]]></description>
			<content:encoded><![CDATA[<p>Install Ganglia</p>
<blockquote><p>
wget http://voxel.dl.sourceforge.net/sourceforge/ganglia/ganglia-3.0.7-1.src.rpm<br />
rpm -Uhv http://apt.sw.be/redhat/el5/en/i386/rpmforge/RPMS/rpmforge-release-0.3.6-1.el5.rf.i386.rpm<br />
yum install libpng-devel libart_lgpl-devel rrdtool-devel freetype-devel<br />
rpmbuild --rebuild ganglia-3.0.7-1.src.rpm<br />
rpm -ivh /usr/src/redhat/RPMS/x86_64/ganglia-gmetad-3.0.7-1.x86_64.rpm /usr/src/redhat/RPMS/x86_64/ganglia-gmond-3.0.7-1.x86_64.rpm /usr/src/redhat/RPMS/x86_64/ganglia-devel-3.0.7-1.x86_64.rpm</p></blockquote>
<p>Install libe</p>
<blockquote><p>wget http://www.theether.org/libe/libe-0.3.0-1.src.rpm<br />
rpmbuild --rebuild libe-0.3.0-1.src.rpm<br />
rpm -ivh /usr/src/redhat/RPMS/x86_64/libe-0.3.0-1.x86_64.rpm </p></blockquote>
<p>Install authd</p>
<blockquote><p>yum install openssl-devel<br />
wget http://www.theether.org/authd/authd-0.2.2-1.src.rpm<br />
rpmbuild --rebuild authd-0.2.2-1.src.rpm </p></blockquote>
<p>You will run into an error like the following; don&#8217;t worry about it, we clean it up next.</p>
<blockquote><p>Installing authd-0.2.2-1.src.rpm<br />
warning: user bnc does not exist - using root<br />
warning: group dusers does not exist - using root<br />
error: Legacy syntax is unsupported: copyright<br />
error: line 5: Unknown tag: Copyright: GPL</p></blockquote>
<p>Finish up authd</p>
<blockquote><p>
mv /usr/src/redhat/SPECS/authd.spec /usr/src/redhat/SPECS/authd.spec.1<br />
sed 's/Copyright/License/g' /usr/src/redhat/SPECS/authd.spec.1 > /usr/src/redhat/SPECS/authd.spec<br />
rpmbuild -ba /usr/src/redhat/SPECS/authd.spec<br />
openssl genrsa -out auth_priv.pem<br />
chmod 600 auth_priv.pem<br />
openssl rsa -in auth_priv.pem -pubout -out auth_pub.pem
</p></blockquote>
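<p>The rename can also be done in place, restricted to the tag line itself so that any other use of the word &#8216;Copyright&#8217; in the spec is left alone. This is only a minimal sketch: the function name &#8216;fix_spec_license&#8217; is made up for illustration, and it assumes a sed that accepts &#8216;-i&#8217; with a backup suffix.</p>

```shell
# Hypothetical helper (not part of rpm or authd): rename the legacy
# "Copyright:" tag to "License:" in a spec file. Only lines that start
# with the tag are changed, and the untouched original is kept next to
# the spec with an extra ".orig" suffix.
fix_spec_license() {
    spec="$1"
    if [ ! -f "$spec" ]; then
        echo "no such spec file: $spec"
        return 1
    fi
    sed -i.orig 's/^Copyright:/License:/' "$spec"
}
```

<p>Usage would be &#8216;fix_spec_license /usr/src/redhat/SPECS/authd.spec&#8217; followed by the same &#8216;rpmbuild -ba&#8217; as above.</p>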
<p>Copy auth_priv.pem and auth_pub.pem to &#8216;/etc&#8217; on each node of the cluster</p>
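<p>If the cluster has more than a couple of nodes, the copy is easier to script. The sketch below only prints the commands so they can be reviewed first. The node list file (one hostname per line), the use of scp as root and the function name are all assumptions, not something from my setup notes.</p>

```shell
# Hypothetical helper: print the scp commands that would copy the authd
# key pair to /etc on every node. The node list is a plain text file
# with one hostname per line (an assumption; use whatever inventory you
# have). "scp -p" keeps the file modes, so the private key stays at 600.
print_key_copy_commands() {
    nodes_file="$1"
    while read -r node; do
        # Skip blank lines in the node list.
        [ -n "$node" ] || continue
        echo "scp -p auth_priv.pem auth_pub.pem root@$node:/etc/"
    done < "$nodes_file"
}
```

<p>Once the output looks right, &#8216;print_key_copy_commands nodes.txt | sh&#8217; would run the copies.</p>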
<blockquote><p>
rpm -ivh /usr/src/redhat/RPMS/x86_64/authd-0.2.2-1.x86_64.rpm
</p></blockquote>
<p>Installing gexec (<a href="http://www.joeandmotorboat.com/files/gexec-0.3.8-4.src.rpm">using my SRPM</a>, includes the &#8216;--with-ganglia&#8217; option)</p>
<blockquote><p>echo "gexec   2875/tcp    # Caltech GEXEC" >> /etc/services<br />
yum install glibc gcc gcc-c++ authd expat-devel<br />
rpm -ivh /usr/src/redhat/RPMS/x86_64/gexec-0.3.8-4.x86_64.rpm</p></blockquote>
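<p>One small caveat with the echo above: running the procedure twice appends the gexec line to /etc/services twice. A guarded version might look like the sketch below. The function name is hypothetical, and the file is a parameter so it can be tried on a scratch copy first.</p>

```shell
# Hypothetical helper: append the gexec entry to a services file only if
# no gexec line is there yet, so repeating the install does not
# duplicate it. Defaults to /etc/services when no file is given.
add_gexec_service() {
    services_file="${1:-/etc/services}"
    if grep -q '^gexec[[:space:]]' "$services_file"; then
        echo "gexec already listed in $services_file"
    else
        echo "gexec   2875/tcp    # Caltech GEXEC" >> "$services_file"
    fi
}
```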
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2008/05/30/ganglia-gexec-authd-and-libe-install-procedure/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>gexec Success!</title>
		<link>http://www.joeandmotorboat.com/2008/05/30/gexec-success/</link>
		<comments>http://www.joeandmotorboat.com/2008/05/30/gexec-success/#comments</comments>
		<pubDate>Thu, 01 Jan 1970 00:00:00 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=485</guid>
		<description><![CDATA[I was finally able to get a clean build of gexec with the &#8216;--with-ganglia&#8217; option. Here&#8217;s what I did:
I downloaded the tarball available at http://therealms.org/oss/ganglia/gexec-0.3.8.tar.gz (thanks to Bernard on the Ganglia mailing list). Then run:
rpmbuild -tb gexec-0.3.8.tar.gz
This created an RPM and an SRPM; the RPM can be deleted, and I installed the SRPM. Should be located [...]]]></description>
			<content:encoded><![CDATA[<p>I was finally able to get a clean build of gexec with the &#8216;--with-ganglia&#8217; option. Here&#8217;s what I did:</p>
<p>I downloaded the tarball available at http://therealms.org/oss/ganglia/gexec-0.3.8.tar.gz <em>(thanks to Bernard on the Ganglia mailing list)</em>. Then run:</p>
<blockquote><p>rpmbuild -tb gexec-0.3.8.tar.gz</p></blockquote>
<p>This created an RPM and an SRPM; the RPM can be deleted, and I installed the SRPM. It should be located at &#8216;/usr/src/redhat/SRPMS/gexec-0.3.8-4.src.rpm&#8217;. I then edited the SPEC file &#8216;/usr/src/redhat/SPECS/gexec.spec&#8217;, removing &#8216;%configure&#8217; and adding the following above the &#8216;make&#8217; line but below the &#8216;%build&#8217; line.</p>
<blockquote><p>./configure --with-ganglia --host=x86_64-redhat-linux-gnu --build=x86_64-redhat-linux-gnu --target=x86_64-redhat-linux --program-prefix= --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/usr/com --mandir=/usr/share/man --infodir=/usr/share/info</p></blockquote>
<p>Next, extract the tarball at &#8216;/usr/src/redhat/SOURCES/gexec-0.3.8.tar.gz&#8217;. Edit &#8216;configure.ac&#8217; to include &#8216;AC_PREFIX_DEFAULT(/usr)&#8217; rather than &#8216;AC_PREFIX_DEFAULT(/usr/local)&#8217;. Then change GANGLIA_LIB to use &#8216;/usr/lib/libganglia.a&#8217; rather than &#8216;@libdir@/libganglia.a&#8217;. I also edited the Makefile to use &#8216;/usr/lib/libganglia.a&#8217; rather than &#8216;@libdir@/libganglia.a&#8217; in a couple of spots. Then move gexec-0.3.8.tar.gz to gexec-0.3.8.tar.gz.OLD and run &#8216;tar zcvf gexec-0.3.8.tar.gz gexec-0.3.8&#8217; to create a new tarball with the changes just made. At this point one can build and install the new RPM by running:</p>
<blockquote><p>
rpmbuild -ba /usr/src/redhat/SPECS/gexec.spec<br />
rpm -ivh /usr/src/redhat/RPMS/x86_64/gexec-0.3.8-4.x86_64.rpm
</p></blockquote>
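<p>The two &#8216;configure.ac&#8217; edits described above can be scripted, which helps if the tarball ever has to be re-extracted. This is a rough sketch, not what I actually ran: the function name is made up, and it assumes both strings appear in &#8216;configure.ac&#8217; exactly as quoted above. The Makefile edits are not covered.</p>

```shell
# Hypothetical helper: apply the two configure.ac edits to an extracted
# gexec source tree. The argument is the directory that contains
# configure.ac.
patch_gexec_configure() {
    conf="$1/configure.ac"
    # 1. Default install prefix: /usr/local becomes /usr.
    # 2. Name the static Ganglia library by its full path.
    sed -e 's|AC_PREFIX_DEFAULT(/usr/local)|AC_PREFIX_DEFAULT(/usr)|' \
        -e 's|@libdir@/libganglia\.a|/usr/lib/libganglia.a|g' \
        "$conf" > "$conf.new" || return 1
    # Replace the original only after sed finished without error.
    mv "$conf.new" "$conf"
}
```

<p>It writes to a temporary file and moves it into place, so no backup copy ends up inside the new tarball; the untouched source is still available in the .OLD tarball.</p>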
<p>I have made my SRPM available, you can download it <a href="http://www.joeandmotorboat.com/files/gexec-0.3.8-4.src.rpm ">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2008/05/30/gexec-success/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Is It You Or Me Ganglia?</title>
		<link>http://www.joeandmotorboat.com/2008/05/29/is-it-you-or-me-ganglia/</link>
		<comments>http://www.joeandmotorboat.com/2008/05/29/is-it-you-or-me-ganglia/#comments</comments>
		<pubDate>Thu, 01 Jan 1970 00:00:00 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Clustering]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=484</guid>
		<description><![CDATA[So I began building a new head cluster node in a KVM, just as a test run and to refine my methodology. I decided to drop Unicluster due to an unresolved issue, this time around I decided to install everything myself. &#8230; Java, check &#8230; Hadoop, check &#8230; Pig, check &#8230; Grid Engine, check &#8230; [...]]]></description>
			<content:encoded><![CDATA[<p>So I began building a new head cluster node in a <a href="http://en.wikipedia.org/wiki/Kernel-based_Virtual_Machine">KVM</a>, just as a test run and to refine my methodology. I decided to drop Unicluster due to an <a href="http://www.grid.org/forum/showthread.php?t=160">unresolved issue</a>, this time around I decided to install everything myself. &#8230; Java, check &#8230; Hadoop, check &#8230; Pig, check &#8230; Grid Engine, check &#8230; OpenMPI, check &#8230; Ganglia, ugh &#8230;</p>
<p>Ganglia seems to be an interesting beast. I built the SRPMs and then installed the RPMs for the &#8220;ganglia monitor core&#8221; without a problem; it was easy and quick. I then moved on to the &#8220;gexec execution environment&#8221;, which includes gexec, gexecd, authd and libe.</p>
<p>The first issue I ran into in building from the SRPM was the dependencies. First, I started with authd and ran into dependency issues during the build. Sadly the SPEC file did not include what the package requires. I attempted the normal RPM (found on Ganglia&#8217;s <a href="http://sourceforge.net/project/showfiles.php?group_id=43021&amp;package_id=36388&amp;release_id=88941">SourceForge</a> page). Even that didn&#8217;t work properly due to a requirement of some old OpenSSL libraries unavailable in CentOS 5.</p>
<blockquote><p>[root@m ganglia]# rpm -qa | grep openssl<br />
openssl-devel-0.9.8b-8.3.el5_0.2<br />
openssl-0.9.8b-8.3.el5_0.2<br />
openssl-devel-0.9.8b-8.3.el5_0.2<br />
openssl-0.9.8b-8.3.el5_0.2<br />
[root@m ganglia]# rpm -ivh authd-0.2.1-1.i386.rpm<br />
error: Failed dependencies:<br />
libcrypto.so.2 is needed by authd-0.2.1-1.i386<br />
libssl.so.2 is needed by authd-0.2.1-1.i386</p></blockquote>
<p>So I went back to attempting to build the SRPM. <a href="http://www.mail-archive.com/ganglia-general@lists.sourceforge.net/msg03846.html">Soon I found out</a> that the above libraries have nothing to do with the build issues I was seeing. My issue was that the libe library was missing. Once I built and installed that, authd built and installed without a problem.</p>
<p>Next, I attempted to build gexec. This proved to have the same issue as authd, the SRPM did not include a requires in the SPEC making it difficult to determine what needs to be installed as a prerequisite. I then started to investigate the errors I was seeing in the build,</p>
<blockquote><p>gexec.c:39:33: error: ganglia/gexec_funcs.h: No such file or directory</p></blockquote>
<p>Googling for this I found a <a href="http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg02443.html">Ganglia Developers email list entry</a> that described that</p>
<blockquote><p>The gexec-0.3.6 available from http://www.theether.org/gexec does not<br />
build with 3.0.* versions of Ganglia. It builds correctly only with 2.*<br />
versions. If you want to build with Ganglia 3, edit the gexec.c to include<br />
/usr/include/ganglia.h and not /usr/include/ganglia/gexec_funcs.h. Of<br />
course, you have to have ganglia-devel installed for this to work. Another<br />
thing, in addition to the above, you have to add #include  to<br />
gexec.c in order to successfully build the gexec.</p></blockquote>
<p>That worked, so I edited gexec.c in the source tarball to include the above changes. My attempt to build again failed on the &#8216;e/llist.h&#8217; include not existing. &#8216;locate&#8217; proved that it did not exist on my machine even though libe is installed. So I went back to that email list post and found this link:</p>
<blockquote><p>http://svn.oscar.openclustergroup.org/svn/oscar-soc/soc-2006/hpcmetrics/ganglia/</p></blockquote>
<p>Looking through the source I found http://svn.oscar.openclustergroup.org/svn/oscar-soc/soc-2006/hpcmetrics/ganglia/src/lib/llist.h and copied it in to &#8216;/usr/include/e/&#8217;. This worked nicely, but as you might expect it failed again. This time looking for libraries in &#8216;/lib&#8217; rather than &#8216;/lib64&#8217;, which is to be expected since I am running x86_64. I symlinked the library into place and moved on.</p>
<p>Now I am at an error that I haven&#8217;t been able to figure out. <a href="http://www.mail-archive.com/ganglia-general@lists.sourceforge.net/msg03849.html">My mailing list post</a> describing the issue has not seen a reply.</p>
<blockquote><p>gexec.c: In function ‘main’:<br />
gexec.c:324: warning: ‘ips’ may be used uninitialized in this function<br />
gcc -DHAVE_CONFIG_H -I. -I. -I. -I.    -O2 -Wall -D_REENTRANT -g<br />
-D_GNU_SOURCE -DDEBUG -c gexec_options.c<br />
gcc  -O2 -Wall -D_REENTRANT -g -D_GNU_SOURCE -DDEBUG  -o gexec -L.<br />
gexec.o gexec_options.o -lpthread -lgexec -le -lauth -lssl -lcrypto<br />
/usr/lib/libganglia.a -lssl -lpthread -lcrypto<br />
/usr/lib/libganglia.a(ganglia.o): In function `gexec_cluster':<br />
(.text+0x10c): undefined reference to `XML_ParserCreate'<br />
/usr/lib/libganglia.a(ganglia.o): In function `gexec_cluster':<br />
(.text+0x160): undefined reference to `XML_SetElementHandler'<br />
/usr/lib/libganglia.a(ganglia.o): In function `gexec_cluster':<br />
(.text+0x16b): undefined reference to `XML_SetUserData'<br />
/usr/lib/libganglia.a(ganglia.o): In function `gexec_cluster':<br />
(.text+0x178): undefined reference to `XML_GetBuffer'<br />
/usr/lib/libganglia.a(ganglia.o): In function `gexec_cluster':<br />
(.text+0x1c4): undefined reference to `XML_ParserFree'<br />
/usr/lib/libganglia.a(ganglia.o): In function `gexec_cluster':<br />
(.text+0x1f6): undefined reference to `XML_ParseBuffer'<br />
/usr/lib/libganglia.a(ganglia.o): In function `gexec_cluster':<br />
(.text+0x265): undefined reference to `XML_GetErrorCode'<br />
/usr/lib/libganglia.a(ganglia.o): In function `gexec_cluster':<br />
(.text+0x26c): undefined reference to `XML_ErrorString'<br />
/usr/lib/libganglia.a(ganglia.o): In function `gexec_cluster':<br />
(.text+0x277): undefined reference to `XML_GetCurrentLineNumber'<br />
collect2: ld returned 1 exit status<br />
make: *** [gexec] Error 1</p></blockquote>
<p>After a bit of Googling, I found that these XML functions belong to expat. I installed expat-devel (as well as a number of other XML devel packages) and attempted to rebuild. Same thing, failure. Next, since the error points at libganglia.a, I figured that perhaps it had not been built with expat support and needed to be rebuilt, so I rebuilt it with expat-devel installed. This fails with the same error as above. After looking at the <a href="http://ganglia.wiki.sourceforge.net/ganglia_readme">doc</a> I noticed that the Ganglia SPEC file does not include &#8216;--enable-gexec&#8217; in the configure. I built the RPMs with this option and still ran into the error. I have attempted to build gexec from the SRPM as well as straight from source. In every case I get the above error. The error (&#8220;collect2: ld returned 1 exit status&#8221;) suggests to me that there is a library (or libraries) missing. But at this point I&#8217;m not really sure at all. If I come up with something (outside of running gexec in standalone) I will be sure to post it. If anyone else out there knows what&#8217;s up, post a comment.</p>
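<p>When a link step dumps a wall of errors like the one above, it helps to boil it down to the list of missing symbols. The filter below is just a sketch (the function name is made up) and it assumes the linker quotes symbols with a backtick and an apostrophe, as in the output above.</p>

```shell
# Hypothetical helper: read linker output on stdin and print each symbol
# named in an "undefined reference to" message once, sorted.
undefined_symbols() {
    sed -n "s/.*undefined reference to [\`']\([A-Za-z_][A-Za-z0-9_]*\)'.*/\1/p" | sort -u
}
```

<p>Something like &#8216;make 2&gt;&amp;1 | undefined_symbols&#8217; would then list the XML_* names, all of which are part of the expat API. One thing worth checking, untested here and only a guess at the cause: a static archive such as libganglia.a does not pull in its own dependencies, so the linker would need to see &#8216;-lexpat&#8217; after it on the link line, and the gcc command above has no &#8216;-lexpat&#8217; at all.</p>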
<p>This all leads me to the point of this post which is &#8230; <em>why is setting this up so difficult</em>? Truth be told I have no clue, but I don&#8217;t think it should be. The Ganglia mailing list was helpful enough, but the documentation seems a little lacking should one run into any issues. One would think that if &#8220;The gexec-0.3.6 available from http://www.theether.org/gexec does not build with 3.0.* versions of Ganglia.&#8221; this should be documented. I don&#8217;t think that I am doing anything strange, and I am using CentOS 5, not some obscure distro.</p>
<p>You may be asking what all these problems with gexec have to do with Ganglia (a guy on the mailing list asked me just that: &#8220;What does this have to do with ganglia?&#8221;). Fair enough. Ganglia is not gexec and gexec is not Ganglia. My response was that the gexec SRPMs are downloadable side by side with all the Ganglia RPMs off of SourceForge. This leads me to believe that questions to the Ganglia mailing list about gexec don&#8217;t seem too far off base. Additionally, for someone who is trying to install these packages for the first time or is new to Ganglia, it seems that the mailing list would be the place to ask, as I imagine there are plenty of folks running gexec hosts in Ganglia. The Ganglia documentation even says of gexec that &#8220;integrating it with ganglia is a bit clumsy&#8221; but provides no information beyond how to run it in standalone mode and how to turn it off if you have configured it to be on by default. To boot, the gexec site hasn&#8217;t been updated since 2004.</p>
<p>Next, you may think that if this is broken and the documentation sucks, why don&#8217;t you fix it? It&#8217;s an open source project. That&#8217;s valid, and I will be happy to write up some documentation on how to build the RPMs for Ganglia and associated applications. For good measure I will even see if I can get it posted to the Ganglia wiki. Of course this hinges on me actually being able to build the RPMs and have everything work properly.</p>
<p>Lastly, here are a few lessons learned:</p>
<ul>
<li>Something I learn time and time again, don&#8217;t assume anything.</li>
<li>Any time you create SRPMs make sure you add the &#8220;BuildRequires&#8221; directive. This alone would have likely solved my issue with gexec after I modified gexec.c or at least would have pointed me in the right direction.</li>
<li>If source code modifications or any other oddities are required to build an application, document them; simply saying that something is clunky or unintuitive is not enough.</li>
<li>If you have a software product you would like other people to use, provide installation procedures. Having install docs is almost as good as having a marketing team. If people find it easy to install and are happy with it they will tell others (example: WordPress).</li>
</ul>
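<p>On the &#8220;BuildRequires&#8221; point, a quick look at a SPEC file before kicking off a build shows whether the packager declared anything at all. A minimal sketch, with a made-up function name:</p>

```shell
# Hypothetical helper: print the BuildRequires lines of a spec file, or
# complain and return non-zero if it declares none.
list_build_requires() {
    spec="$1"
    if grep '^BuildRequires:' "$spec"; then
        return 0
    fi
    echo "$spec declares no BuildRequires"
    return 1
}
```

<p>Running it against &#8216;/usr/src/redhat/SPECS/gexec.spec&#8217; before the first build would have told me up front that I was on my own for dependencies.</p>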
<p>That&#8217;s it for my rant. Thanks. <img src='http://www.joeandmotorboat.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2008/05/29/is-it-you-or-me-ganglia/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>More Hadoop, Grid Engine Goodness.</title>
		<link>http://www.joeandmotorboat.com/2008/05/23/more-hadoop-grid-engine-goodness/</link>
		<comments>http://www.joeandmotorboat.com/2008/05/23/more-hadoop-grid-engine-goodness/#comments</comments>
		<pubDate>Thu, 01 Jan 1970 00:00:00 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Clustering]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=482</guid>
		<description><![CDATA[Over at GridEngine.info they found a link on DanT&#8217;s Sun blog that has a sweet tutorial on setting up Hadoop using SGE&#8217;s parallel environments with loose integration.
Here we are relying on master node to start othe daemons ( [rs]sh the machine and start daemons) and distribute jobs , and we donot have control on the [...]]]></description>
			<content:encoded><![CDATA[<p>Over at <a href="http://gridengine.info/articles/2008/05/23/creating-hadoop-pe-under-grid-engine">GridEngine.info</a> they found a link on <a href="http://blogs.sun.com/templedf/entry/hadoop_sun_grid_engine">DanT&#8217;s Sun blog</a> that has a sweet tutorial on <a href="http://blogs.sun.com/ravee/entry/creating_hadoop_pe_under_sge">setting up Hadoop using SGE&#8217;s parallel environments</a> with loose integration.</p>
<blockquote><p>Here we are relying on master node to start othe daemons ( [rs]sh the machine and start daemons) and distribute jobs , and we donot have control on the <em>TaskTracker</em> threads. This way of setting a pe in Grid Engine is called <a title="SGE Loose integration" href="http://gridengine.sunsource.net/howto/howto.html">loose-integration</a></p>
<p>With some more effort one could also achieve a <strong>tighter integration</strong> wherein the task of starting daemons and tasks on other slaves could be done by SGE. But this would require further understanding of Hadoop internals.</p></blockquote>
<p>Pretty dope.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2008/05/23/more-hadoop-grid-engine-goodness/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Using Pig with Hadoop.</title>
		<link>http://www.joeandmotorboat.com/2008/05/23/using-pig-with-hadoop/</link>
		<comments>http://www.joeandmotorboat.com/2008/05/23/using-pig-with-hadoop/#comments</comments>
		<pubDate>Thu, 01 Jan 1970 00:00:00 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Clustering]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=481</guid>
		<description><![CDATA[Pig is a query language for use with Hadoop. It allows users to query Hadoop data much as they would a SQL database. Formally, according to their website:
Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property [...]]]></description>
			<content:encoded><![CDATA[<p>Pig is a query language for use with Hadoop. It allows users to query Hadoop data much as they would a SQL database. Formally, according to their website:</p>
<blockquote><p>Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turns enables them to handle very large data sets.</p></blockquote>
<p>To get rolling you need the following:</p>
<ul>
<li>A Java SDK installed</li>
<li>Ant installed</li>
<li>Subversion</li>
<li>A working installation of Hadoop</li>
</ul>
<p>Once you are rolling with those items we can install Pig and test it out.</p>
<p>First, you need to download Pig from their Subversion repository. Once done you will need to build it with Ant.</p>
<blockquote><p>svn co http://svn.apache.org/repos/asf/incubator/pig/trunk pig-svn<br />
cd pig-svn<br />
ant</p></blockquote>
<p>From there you can run the following command to drop into the interactive shell.</p>
<blockquote><p>java -cp pig.jar:HADOOPSITEPATH org.apache.pig.Main</p></blockquote>
<p>Or you can run a pig script that you have already created.</p>
<blockquote><p>java -cp pig.jar:HADOOPSITEPATH org.apache.pig.Main somescript.pig</p></blockquote>
<p>HADOOPSITEPATH needs to point to the directory that contains the hadoop-site.xml file.</p>
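<p>Since a wrong HADOOPSITEPATH is an easy mistake to make, a small wrapper can check for hadoop-site.xml before building the classpath. This is only a sketch: the function name and the example path are made up.</p>

```shell
# Hypothetical helper: build the classpath for Pig from the location of
# pig.jar and the Hadoop configuration directory, refusing to continue
# if hadoop-site.xml is not in that directory.
pig_classpath() {
    pig_jar="$1"
    hadoop_conf_dir="$2"
    if [ ! -f "$hadoop_conf_dir/hadoop-site.xml" ]; then
        echo "hadoop-site.xml not found in $hadoop_conf_dir"
        return 1
    fi
    echo "$pig_jar:$hadoop_conf_dir"
}
```

<p>With that defined, something like &#8216;java -cp "$(pig_classpath pig.jar /path/to/hadoop/conf)" org.apache.pig.Main&#8217; starts the shell, where /path/to/hadoop/conf stands in for wherever your hadoop-site.xml lives.</p>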
<p>If you run into an issue such as:</p>
<blockquote><p>Caused by: org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol org.apache.hadoop.dfs.ClientProtocol version mismatch. (client = 29, server = 23)</p></blockquote>
<p>You will need to upgrade Hadoop so the versions match.</p>
<p>In the end you should get something that looks like this:</p>
<blockquote><p>[cluster@front pig-svn]$ java -cp pig.jar:HADOOPSITEPATH org.apache.pig.Main<br />
2008-05-23 10:37:42,478 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: front.esper:9000<br />
2008-05-23 10:37:42,585 [main] WARN  org.apache.hadoop.fs.FileSystem - "front.esper:9000" is a deprecated filesystem name. Use "hdfs://front.esper:9000/" instead.<br />
2008-05-23 10:37:43,117 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: front.esper:9001<br />
2008-05-23 10:37:43,246 [main] WARN  org.apache.hadoop.fs.FileSystem - "front.esper:9000" is a deprecated filesystem name. Use "hdfs://front.esper:9000/" instead.<br />
grunt&gt;</p></blockquote>
<p>If you need more info on the above steps check out the <a href="http://wiki.apache.org/pig/GettingStarted">Pig Wiki</a>.</p>
<p>From here you can follow their <a href="http://wiki.apache.org/pig/PigTutorial">tutorial</a> or play around in the <a href="http://wiki.apache.org/pig/Grunt">shell</a>. Regarding the tutorial, I can&#8217;t seem to find the download of the archive they mention &#8220;Pig tutorial file (*.gz)&#8221;. If anyone knows where that can be found let me know and I will post it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2008/05/23/using-pig-with-hadoop/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Map Reduce and MPI.</title>
		<link>http://www.joeandmotorboat.com/2008/04/30/map-reduce-and-mpi/</link>
		<comments>http://www.joeandmotorboat.com/2008/04/30/map-reduce-and-mpi/#comments</comments>
		<pubDate>Thu, 01 Jan 1970 00:00:00 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Clustering]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=478</guid>
		<description><![CDATA[Over at GridGuru&#8217;s they have an interesting article regarding Map Reduce and its applications. The Map Reduce crowd has been growing of late and is outspoken about what a great tool it is. Without a doubt it is, but something I learned a long time ago is that for each job there is a correct [...]]]></description>
			<content:encoded><![CDATA[<p>Over at GridGuru&#8217;s they have an <a href="http://gridgurus.typepad.com/grid_gurus/2008/04/the-mapreduce-p.html">interesting article</a> regarding Map Reduce and its applications. The Map Reduce crowd has been growing of late and is outspoken about what a great tool it is. Without a doubt it is, but something I learned a long time ago is that for each job there is a correct tool. You don&#8217;t use a sledgehammer to fix your watch and you don&#8217;t use a pair of tweezers for demolition.</p>
<blockquote><p>I am a skeptic, which is not to say I have anything against a generalized framework for distributing data to a large number of processors. Nor does it imply that I enjoy MPI and its coherence arising from cacophonous chatter (if all goes well). I just don’t think MapReduce is particularly &#8220;simple&#8221;. The key promoters of this algorithm such as Yahoo and Google have serious-experts MapReducing their particular problem sets and thus they make it look easy.</p>
<p>&#8230;</p>
<p>Sadly this implies that processing data in parallel is still hard no matter how good of a programmer you are nor how sophisticated your programming language is.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2008/04/30/map-reduce-and-mpi/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
