<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>tdistler.com</title>
	<atom:link href="http://tdistler.com/feed" rel="self" type="application/rss+xml" />
	<link>http://tdistler.com</link>
	<description>&#8220;To err is human, but to really foul things up you need a computer.&#8221; (Paul Ehrlich)</description>
	<lastBuildDate>Thu, 22 Jul 2010 19:01:38 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Audio Resampling Using FFMpeg (avcodec)</title>
		<link>http://tdistler.com/2010/07/22/audio-resampling-using-ffmpeg-avcodec</link>
		<comments>http://tdistler.com/2010/07/22/audio-resampling-using-ffmpeg-avcodec#comments</comments>
		<pubDate>Thu, 22 Jul 2010 19:01:38 +0000</pubDate>
		<dc:creator>Tom</dc:creator>
				<category><![CDATA[Code Monkey]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[video]]></category>

		<guid isPermaLink="false">http://tdistler.com/?p=427</guid>
		<description><![CDATA[FFMpeg provides many powerful features for processing audio and video. One cool thing it can do is resample an audio stream. This allows you to convert, say, a 44.1kHz audio stream down to 8kHz, or up to 48kHz. What&#8217;s more, FFMpeg can do the conversion to any arbitrary sample rate. This allows you to do [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://tdistler.com/wp-content/uploads/2010/07/ffmpeg_logo.jpg"><img class="alignright size-full wp-image-433" title="ffmpeg_logo" src="http://tdistler.com/wp-content/uploads/2010/07/ffmpeg_logo.jpg" alt="" width="150" height="37" /></a><a title="FFMpeg" href="http://www.ffmpeg.org/" target="_blank">FFMpeg</a> provides many powerful features for processing audio and video. One cool thing it can do is resample an audio stream. This allows you to convert, say, a 44.1kHz audio stream down to 8kHz, or up to 48kHz. What&#8217;s more, FFMpeg can do the conversion to any arbitrary sample rate. This allows you to do cool things like smoothly changing the audio playback speed over time (see sample code below).</p>
<p>There are many pages describing how to resample audio using the ffmpeg command line application, but what about doing resampling in your own program? To do that, you need to use the avcodec library (<code>libavcodec.so</code> on Linux and <code>avcodec.dll</code> on Windows).</p>
<ol>
<li>Include <code>avcodec.h</code></li>
<li>Call <code>avcodec_init()</code> to initialize the FFMpeg library.</li>
<li>Create a resampling context using <code>av_resample_init()</code> that describes how you want the resampling done.</li>
<li>Call <code>av_resample()</code> to do the actual resampling on your audio buffer.</li>
<li>When you&#8217;re done with the resampling context, delete it with <code>av_resample_close()</code>.</li>
<li>Finally, link your application against <code>avcodec</code>, <code>avutil</code>, and <code><a title="zlib Library" href="http://www.zlib.net/" target="_blank">zlib</a></code> (it won&#8217;t work on Linux without this one).</li>
</ol>
<p>Here it is in pseudocode:</p>
<blockquote><p><code>#include "libavcodec/avcodec.h"</code></p>
<p><code>avcodec_init();</code></p>
<p><code>struct AVResampleContext* ctx = av_resample_init( ... );</code></p>
<p><code>av_resample( ctx, ... );</code></p>
<p><code> </code></p>
<p><code>av_resample_close( ctx );</code></p></blockquote>
<p>That&#8217;s it&#8230; seriously!</p>
<p><strong>Sample Code (Linux):</strong></p>
<p>Here&#8217;s a sample program I wrote that takes a raw 44.1kHz/16bit/mono audio file and plays it back using the <a title="Pulse Audio" href="http://pulseaudio.org/" target="_blank">pulseaudio</a> API. The catch is that it allows you to specify a &#8220;skew&#8221; parameter which will cause the audio to dynamically speed up and slow down (via resampling). The amount of resampling is controlled by a sine wave, which is what drives the speed changes.</p>
<p>Download: <a title="Resample tar ball" href="media/code/resample.tar.bz" target="_blank"><strong>resample.tar.bz</strong></a></p>
<p>To unpack and build, type:</p>
<blockquote><p><code>$ tar -xjvf resample.tar.bz<br />
$ make</code></p></blockquote>
<p>First, run the sample with no skew:</p>
<blockquote><p><code>$ ./resample audio_16b_44k_mono_pcm_raw 0</code></p></blockquote>
<p>Now, try it with a heavy skew:</p>
<blockquote><p><code>$ ./resample audio_16b_44k_mono_pcm_raw -10000</code></p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://tdistler.com/2010/07/22/audio-resampling-using-ffmpeg-avcodec/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Video Tearing, V-Sync, and Triple-Buffering</title>
		<link>http://tdistler.com/2010/07/18/video-tearing-v-sync-and-triple-buffering</link>
		<comments>http://tdistler.com/2010/07/18/video-tearing-v-sync-and-triple-buffering#comments</comments>
		<pubDate>Sun, 18 Jul 2010 20:45:26 +0000</pubDate>
		<dc:creator>Tom</dc:creator>
				<category><![CDATA[Code Monkey]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[video]]></category>

		<guid isPermaLink="false">http://tdistler.com/?p=416</guid>
		<description><![CDATA[When rendering graphics or video to the screen, it&#8217;s important to understand the display process; in particular, vertical sync (v-sync). A common problem when starting out is an issue called &#8220;tearing&#8221;&#8230; where the video appears to be torn horizontally down the middle (see picture). I&#8217;ve been looking for a good explanation to share about why [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://tdistler.com/wp-content/uploads/2010/07/v_sync.jpg"><img class="alignright size-full wp-image-419" title="v_sync" src="http://tdistler.com/wp-content/uploads/2010/07/v_sync.jpg" alt="" width="150" height="113" /></a>When rendering graphics or video to the screen, it&#8217;s important to understand the display process; in particular, vertical sync (v-sync). A common problem when starting out is an issue called &#8220;tearing&#8221;&#8230; where the video appears to be torn horizontally down the middle (see picture). I&#8217;ve been looking for a good explanation to share about why video tearing occurs and how to solve it (from a technical perspective). I found the following thread over at <a title="Hard Forum" href="http://hardforum.com" target="_blank">[H]ard|Forum</a>, which I think does a pretty good job.</p>
<p>From the thread &#8220;<a title="How VSync works, and why people loathe it" href="http://hardforum.com/showthread.php?s=9eb741eaa733ffb028c860e9805c04f5&amp;t=928593" target="_blank">How VSync works, and why people loathe it</a>&#8221;:</p>
<blockquote><p>There is a technique called triple-buffering that solves this VSync problem. Lets go back to our 50FPS, 75Hz example. Frame 1 is in the frame buffer, and 2/3 of frame 2 are drawn in the back buffer. The refresh happens and frame 1 is grabbed for the first time. The last third of frame 2 are drawn in the back buffer, and the first third of frame 3 is drawn in the second back buffer (hence the term triple-buffering). The refresh happens, frame 1 is grabbed for the second time, and frame 2 is copied into the frame buffer and the first part of frame 3 into the back buffer. The last 2/3 of frame 3 are drawn in the back buffer, the refresh happens, frame 2 is grabbed for the first time, and frame 3 is copied to the frame buffer. The process starts over. This time we still got 2 frames, but in only 3 refresh cycles. That&#8217;s 2/3 of the refresh rate, which is 50FPS, exactly what we would have gotten without it. Triple-buffering essentially gives the video card someplace to keep doing work while it waits to transfer the back buffer to the frame buffer, so it doesn&#8217;t have to waste time. Unfortunately, triple-buffering isn&#8217;t available in every game, and in fact it isn&#8217;t too common. It also can cost a little performance to utilize, as it requires extra VRAM for the buffers, and time spent copying all of them around. However, triplebuffered VSync really is the key to the best experience as you eliminate tearing without the downsides of normal VSync (unless you consider the fact that your FPS is capped a downside&#8230; which is silly because you can&#8217;t see an FPS higher than your refresh anyway).</p></blockquote>
<p>If the thread is ever unavailable, you can download a PDF version <a title="PDF: How VSync works, and why people loathe it" href="http://tdistler.com/wp-content/uploads/2010/07/How-VSync-works-and-why-people-loathe-it.pdf" target="_blank">HERE</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://tdistler.com/2010/07/18/video-tearing-v-sync-and-triple-buffering/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ThreadNinja: Finding Rogue POSIX Threads</title>
		<link>http://tdistler.com/2010/07/08/threadninja-finding-rogue-posix-threads</link>
		<comments>http://tdistler.com/2010/07/08/threadninja-finding-rogue-posix-threads#comments</comments>
		<pubDate>Fri, 09 Jul 2010 02:10:04 +0000</pubDate>
		<dc:creator>Tom</dc:creator>
				<category><![CDATA[Code Monkey]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://tdistler.com/?p=277</guid>
		<description><![CDATA[What Is It?
ThreadNinja is a Linux library my team created that tracks pthread_create() and pthread_join() calls in an application. It prints a stacktrace where each thread is created and where it is joined. Any rogue (unjoined) threads are reported when the application exits. ThreadNinja is unobtrusive: it does NOT have to be compiled into the [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://tdistler.com/wp-content/uploads/2010/05/tux-ninja.jpg"><img class="alignright size-thumbnail wp-image-278" title="tux-ninja" src="http://tdistler.com/wp-content/uploads/2010/05/tux-ninja-150x150.jpg" alt="Tux Ninja" width="150" height="150" /></a><strong>What Is It?</strong></p>
<p>ThreadNinja is a Linux library my team created that tracks <code>pthread_create()</code> and <code>pthread_join()</code> calls in an application. It prints a stacktrace where each thread is created and where it is joined. Any rogue (unjoined) threads are reported when the application exits. ThreadNinja is <em>unobtrusive</em>: it does NOT have to be compiled into the code. This means you can use it on applications you didn&#8217;t compile.</p>
<p>We found it useful and thought we&#8217;d share it. It&#8217;s by no means production code&#8230; just a tool. Hack on it, expand it, change it&#8230; whatever. It&#8217;s pretty small, so it should be easy to dive right in. We&#8217;ve released it under the <a title="BSD License Page" href="http://www.opensource.org/licenses/bsd-license.php" target="_blank">BSD license</a>.</p>
<p><strong>Cut To The Chase</strong></p>
<p>You can check out the source code from <a title="Thread Ninja Project on Google Code" href="http://code.google.com/p/threadninja/" target="_blank">Google Code</a>, or download the version 1.0 tarball directly (<a title="Thread Ninja Source Tarball" href="/media/code/threadninja.tar.gz" target="_blank">threadninja.tar.gz</a>).</p>
<p>To build ThreadNinja, simply untar it and call <code>make</code>:<br />
<code>&gt; tar -zxf threadninja.tar.gz<br />
&gt; make</code></p>
<p>Now, simply use <code>LD_PRELOAD</code> to run the application:</p>
<p><code>&gt; LD_PRELOAD=/path/to/threadninja/build/libthreadninja.so.1 TheApplication</code></p>
<p>If you don&#8217;t see function names in the stacktraces that are generated, then the application needs to be compiled with debug symbols. For my test app, I had to compile with the <code>-rdynamic</code> option:</p>
<p><code>&gt; g++ -Wall -rdynamic main.cpp -lpthread</code></p>
<p>This adds the executable&#8217;s global symbols to its dynamic symbol table, which is where the application&#8217;s function names in the stacktrace come from. For more info, look at the <code>--export-dynamic</code> option on the <a title="GNU Linker man page" href="http://linux.die.net/man/1/ld" target="_blank">GNU linker (ld) man page</a>.</p>
<p><strong>The Story Behind ThreadNinja</strong></p>
<p>My team was assigned to stabilize a large video application that runs as a Linux-based appliance. The application consisted of 100,000+ lines of code that were a tangle of build warnings, circular references, and many creative hacks. Our particular task was to fix a persistent set of seg-faults and memory leaks.<span id="more-277"></span></p>
<p>One of the first things we noticed was that the application didn&#8217;t shut down properly. The shutdown logic was something akin to: signal components to exit, sleep 2 seconds, call Release() a couple times to clean up extra ref-counts, sleep 2 seconds, and then just call exit() to get the process to terminate (bypassing the remaining clean up code). Of course, this renders <a title="Valgrind" href="http://valgrind.org/" target="_blank">Valgrind</a> useless when trying to find memory leaks, because the automatic memory cleanup code gets bypassed during the process abort.</p>
<p>As a result, our first priority was to get the app to shut down cleanly. The first issue we ran into was that <code>pthread_join()</code> blocked indefinitely because threads were failing to terminate. We tried using GDB to track the threads, but many hundreds of threads were being created and destroyed dynamically. We needed a way to let the application run for hours, allow 1000+ threads to live and die, and still be able to track rogue threads. Hence, ThreadNinja was born.</p>
<p><strong>How It Works</strong></p>
<p>Through the magic of <code><a title="LD_PRELOAD" href="http://linux.die.net/man/8/ld.so" target="_blank">LD_PRELOAD</a></code>, ThreadNinja &#8220;injects&#8221; itself between the application and the pthread library for all calls to <code>pthread_create()</code> and <code>pthread_join()</code>. This happens because <code>LD_PRELOAD</code> instructs the loader to load ThreadNinja into memory first, before other libraries (in this case, before the pthread library). The result is that ThreadNinja&#8217;s implementations of <code>pthread_create()</code> and <code>pthread_join()</code> are used by the application instead of pthread&#8217;s own. ThreadNinja tracks the calls to these methods and then passes each call on to the &#8220;real&#8221; pthread implementation. From the application&#8217;s point-of-view, the behavior of the thread methods is the same&#8230; the tracking is transparent.</p>
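<p>To make the idea concrete, here is a minimal sketch of the general interposition technique. It is not ThreadNinja&#8217;s actual source: it only counts calls instead of printing stacktraces, it logs to stderr, and the names <code>tracked_running()</code> and <code>demo_worker()</code> are made up for illustration. It assumes a glibc-style loader where <code>dlsym(RTLD_NEXT, ...)</code> returns the next definition of a symbol in the search order. Built as a shared library (something like <code>gcc -shared -fPIC tracker.c -o libtracker.so -ldl</code>, file names hypothetical), it could be loaded with <code>LD_PRELOAD</code>.</p>

```c
#define _GNU_SOURCE            /* must precede all includes to expose RTLD_NEXT */
#include <dlfcn.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#ifndef RTLD_NEXT              /* fallback (glibc's value) if _GNU_SOURCE came too late */
#define RTLD_NEXT ((void *)-1L)
#endif

typedef int (*create_fn)(pthread_t *, const pthread_attr_t *,
                         void *(*)(void *), void *);
typedef int (*join_fn)(pthread_t, void **);

static atomic_int threads_created;   /* successful pthread_create() calls */
static atomic_int threads_joined;    /* successful pthread_join() calls */

/* Same name and signature as the real function, so the application's calls
 * resolve here when this code comes first in the symbol search order. */
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg)
{
    /* Look up the next definition after this one, i.e. the real one. */
    create_fn real_create = (create_fn)dlsym(RTLD_NEXT, "pthread_create");
    if (real_create == NULL)
        return EAGAIN;

    int rc = real_create(thread, attr, start_routine, arg);
    if (rc == 0) {
        atomic_fetch_add(&threads_created, 1);
        fprintf(stderr, "Thread Created: %lu\n", (unsigned long)*thread);
    }
    return rc;
}

int pthread_join(pthread_t thread, void **retval)
{
    join_fn real_join = (join_fn)dlsym(RTLD_NEXT, "pthread_join");
    if (real_join == NULL)
        return EINVAL;

    int rc = real_join(thread, retval);
    if (rc == 0) {
        atomic_fetch_add(&threads_joined, 1);
        fprintf(stderr, "Thread Joined: %lu\n", (unsigned long)thread);
    }
    return rc;
}

/* Hypothetical helper: threads created but not yet joined. */
int tracked_running(void)
{
    return atomic_load(&threads_created) - atomic_load(&threads_joined);
}

/* Trivial thread body, used only to exercise the wrappers. */
static void *demo_worker(void *arg)
{
    return arg;
}
```

<p>Creating a thread with <code>demo_worker</code> raises <code>tracked_running()</code> to 1, and joining it brings the count back to 0. A real tool would also record a backtrace and timestamp at each call and print a summary from an exit handler.</p>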
<p><strong>Output</strong></p>
<p>Each time <code>pthread_create()</code> is called, a stacktrace and timestamp are printed to stdout:</p>
<p><code>Thread Created: 3047947120<br />
[bt] Thu Jul  8 16:31:23 2010</code></p>
<p><code>[bt] ./a.out(StartService(unsigned long*, int))<br />
[bt] ./a.out(Initialize())<br />
[bt] ./a.out(main+0xb) [0x80489fa]<br />
[bt] /lib/libc.so.6(__libc_start_main+0xe6) [0xb5bbb6]<br />
[bt] ./a.out() [0x8048811]</code></p>
<p>The first line (&#8220;Thread Created&#8221;) gives the value of the <code>pthread_t</code> handle, so you can later track where rogue threads were created. The next line is the time when the create happened. The remaining lines are the call stack that led to the <code>pthread_create()</code> call.</p>
<p>Each time <code>pthread_join()</code> is called, similar information is printed to stdout:</p>
<p><code>Thread Joined: 3047947120<br />
[bt] Thu Jul  8 16:31:31 2010</code></p>
<p><code>[bt] ./a.out(Terminate())<br />
[bt] ./a.out(main+0x10) [0x80489ff]<br />
[bt] /lib/libc.so.6(__libc_start_main+0xe6) [0xb5bbb6]<br />
[bt] ./a.out() [0x8048811]</code></p>
<p>When the application terminates (cleanly or uncleanly), a summary of the current state of the application threads is printed:</p>
<p><code>exit_handler()<br />
[Thread Summary]<br />
Total Created: 573<br />
Total Joined: 568<br />
Total Running: 5</code></p>
<p>In this case, you&#8217;ll notice that 5 threads were never joined.</p>
<p><strong>Limitations</strong></p>
<p>ThreadNinja only tracks calls to <code>pthread_create()</code> and <code>pthread_join()</code>. This means calls like <code>system()</code>, <code>exec()</code>, and <code>fork()</code> are not tracked. Also, calls to <code>pthread_cancel()</code> are not tracked. We had started adding code to track pthread mutexes and stuff, but it turned out we didn&#8217;t need it. Feel free to add support for all this stuff and submit changes to the Google Code site.</p>
<p>Happy coding!</p>
]]></content:encoded>
			<wfw:commentRss>http://tdistler.com/2010/07/08/threadninja-finding-rogue-posix-threads/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Security Fallacy</title>
		<link>http://tdistler.com/2010/07/01/security-fallacy</link>
		<comments>http://tdistler.com/2010/07/01/security-fallacy#comments</comments>
		<pubDate>Thu, 01 Jul 2010 20:21:20 +0000</pubDate>
		<dc:creator>Tom</dc:creator>
				<category><![CDATA[Tech and Security]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://tdistler.com/?p=384</guid>
		<description><![CDATA[&#8220;Cryptography can be used to secure my data. Therefore, if I use cryptography my data is secure.&#8221;
Wrong.
I think Bruce Schneier described it best (paraphrased): Cryptography is like having a really strong front door on your house&#8230; 2 foot thick steel, blast proof, the whole 9 yards. A thief isn&#8217;t going to try and break through [...]]]></description>
			<content:encoded><![CDATA[<p><em>&#8220;Cryptography can be used to secure my data. Therefore, if I use cryptography my data is secure.&#8221;</em></p>
<p>Wrong.</p>
<p>I think <a title="Bruce Schneier's Blog" href="http://www.schneier.com/" target="_blank">Bruce Schneier</a> described it best (paraphrased): Cryptography is like having a really strong front door on your house&#8230; 2 foot thick steel, blast proof, the whole 9 yards. A thief isn&#8217;t going to try and break through your front door&#8230; they&#8217;ll just climb through a window!</p>
<p>Security is about the whole system, not just the crypto. <a title="xkcd" href="http://xkcd.com" target="_blank">xkcd</a> summed it up nicely:</p>
<p><a href="http://tdistler.com/wp-content/uploads/2010/07/xkcd_security.jpg"><img class="aligncenter size-full wp-image-385" title="xkcd_security" src="http://tdistler.com/wp-content/uploads/2010/07/xkcd_security.jpg" alt="xkcd: Security" width="448" height="274" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://tdistler.com/2010/07/01/security-fallacy/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>High-performance Timing on Linux / Windows</title>
		<link>http://tdistler.com/2010/06/27/high-performance-timing-on-linux-windows</link>
		<comments>http://tdistler.com/2010/06/27/high-performance-timing-on-linux-windows#comments</comments>
		<pubDate>Mon, 28 Jun 2010 05:15:55 +0000</pubDate>
		<dc:creator>Tom</dc:creator>
				<category><![CDATA[Code Monkey]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[windows]]></category>

		<guid isPermaLink="false">http://tdistler.com/?p=350</guid>
		<description><![CDATA[High-performance timing is hard… no doubt about it. I can’t tell you how many times I’ve seen high-performance timing code done wrong. Timing is one of those things where a little knowledge can be problematic; the code may work, but it either won’t perform or will exhibit “unexplained” behavior. The purpose of this post is [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://tdistler.com/wp-content/uploads/2010/06/melting_clock.jpg"><img class="alignright size-full wp-image-352" title="melting_clock" src="http://tdistler.com/wp-content/uploads/2010/06/melting_clock.jpg" alt="Melting Clock" width="96" height="96" /></a>High-performance timing is hard… no doubt about it. I can’t tell you how many times I’ve seen high-performance timing code done wrong. Timing is one of those things where a little knowledge can be problematic; the code may work, but it either won’t perform or will exhibit “unexplained” behavior. The purpose of this post is to explain a foundational component to getting timing right: <em>the clock</em>. I won’t focus on theory… this post is meant to be pragmatic.</p>
<p><em>Note</em>: I’m talking here about interval timing (i.e. accurately measuring the duration between 2 events). This is different than synchronizing different clocks or maintaining accurate wall time.</p>
<p><strong>Anatomy of the Clock</strong></p>
<p>The first mistake most people make when doing timing is to use functions like <code><a href="http://www.opengroup.org/onlinepubs/000095399/functions/gettimeofday.html" target="_blank">gettimeofday()</a></code>, <code><a title="MSDN: GetSystemTime" href="http://msdn.microsoft.com/en-us/library/ms724390(VS.85).aspx" target="_blank">GetSystemTime()</a></code>, etc. These functions return what is called &#8220;wall time&#8221;… time that corresponds to a calendar date/time. These clocks suffer from the following limitations:</p>
<ol>
<li>They have a low resolution: “High-performance” timing, by my definition, requires clock resolutions into the microseconds or better.</li>
<li>They can jump forwards and backwards in time: Computer clocks all tick at slightly different rates, which causes the time to drift. Most systems have NTP enabled, which periodically adjusts the system clock to keep it in sync with “actual” time. The adjustment can cause the clock to suddenly jump forward (artificially inflating your timing numbers) or jump backwards (causing your timing calculations to go negative or hugely positive).</li>
</ol>
<p>For interval timing, all that’s needed for a clock is a simple counter that increments at a stable rate. For high-performance timing, the rate this counter increments should be high. A related constraint is that the counter must be monotonic (can never “tick” backwards&#8230; ever). The counter may overflow and wrap back to 0, but using unsigned math in your timing calculations can compensate for that (see example below).</p>
<p>Something to note: since we are usually measuring short durations, the drift of the clock is so small that we aren&#8217;t concerned by it (what matters is the drift between successive reads, not total drift over time).<span id="more-350"></span></p>
<p><strong>Using the Clock</strong></p>
<p>To use the clock, you need access to 2 pieces of information: the clock value, and the rate (frequency) at which it increments. Once you have that information, it’s trivial to calculate the duration between 2 successive reads of the clock.</p>
<p>For example, consider the following:</p>
<ul>
<li><code>start</code> equals the clock value at the beginning of the interval to measure.</li>
<li><code>end</code> equals the clock value at the end of the interval.</li>
<li><code>frequency</code> equals the number of times the clock increments per second.</li>
</ul>
<p>Calculating the duration of the interval (in seconds) is as simple as:</p>
<p><code>duration = (end - start) / frequency</code></p>
<p><code>duration</code> will equal the floating-point interval time in seconds (e.g. 0.000237 = 237us).</p>
<p>A common use case I’ve seen is wanting to measure the duration between successive calls to a function. Here’s one way to implement that:</p>
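<p><code>float freq = (float) get_frequency(); /* clock ticks per second */<br />
unsigned last = read_clock(); /* clock value at the previous call */</code></p>
<p><code>void Foo()<br />
{<br />
unsigned now = read_clock();<br />
float duration = (float)(now - last) / freq; /* seconds since the previous call */<br />
last = now;<br />
}</code></p>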
<p>It is important to use unsigned variables for <code>now</code> and <code>last</code> or your calculations will go haywire if the clock ever wraps back to 0. Consider what happens if <code>now</code> is less than <code>last</code>. Unsigned subtraction wraps around (modulo 2^N for an N-bit type), which produces the correct result as long as the counter itself wraps at the full width of the variable. If you don&#8217;t believe me, just write some test code and see for yourself&#8230; it&#8217;s an important concept to understand.</p>
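<p>Here is a small sketch of that test code. It assumes a free-running 32-bit counter; the helper names <code>elapsed_ticks()</code> and <code>ticks_to_seconds()</code> are made up for illustration and are not part of any API.</p>

```c
#include <stdint.h>

/* Ticks elapsed between two reads of a free-running 32-bit counter.
 * Unsigned subtraction is performed modulo 2^32, so the result is still
 * correct when the counter wrapped past zero between the two reads
 * (provided it wrapped at most once). */
static uint32_t elapsed_ticks(uint32_t last, uint32_t now)
{
    return now - last;
}

/* Convert a tick count to seconds, given the counter rate in ticks per second. */
static double ticks_to_seconds(uint32_t ticks, double frequency_hz)
{
    return (double)ticks / frequency_hz;
}
```

<p>For example, with <code>last = 0xFFFFFFF0</code> and <code>now = 0x00000010</code> the counter has wrapped, yet <code>elapsed_ticks()</code> returns <code>0x20</code> (32 ticks). With signed variables the same subtraction would produce a large negative number.</p>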
<p><strong>Don’t Use RDTSC As Your Clock</strong></p>
<p>I’m dismayed by how many forum posts suggest that newbies use <a title="Wikipedia: RDTSC" href="http://en.wikipedia.org/wiki/RDTSC" target="_blank">RDTSC</a> for timing. Don’t get me wrong; the TSC isn’t bad in-and-of-itself. It’s just way too hard to get timing right with it unless you’re an expert… and if you’re reading this, chances are you aren’t <img src='http://tdistler.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> . Stick to the higher level APIs I present below.</p>
<p>Even if you are an expert, I still suggest avoiding RDTSC for the following reasons:</p>
<ol>
<li>It’s processor core specific: Using RDTSC will give different values on different processor cores. This causes non-monotonic clock behavior as a thread is migrated across cores during execution (i.e. the clock can tick backwards on two successive reads). Conversely, the clock may also appear to jump forwards if the thread moves to a different core.</li>
<li>It doesn’t always tick at the same rate: On some systems, the clock frequency will vary as the CPU load changes. This is due to power saving features in the processor that throttle down the clock speed when the load is low (e.g. Intel SpeedStep).</li>
<li>It’s unnecessary: there are better alternatives to RDTSC that are readily available (<a title="Wikipedia: High Precision Event Timer" href="http://en.wikipedia.org/wiki/High_Precision_Event_Timer" target="_blank">HPET</a> and <a title="Wikipedia: Advanced Programmable Interrupt Controller" href="http://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller" target="_blank">APIC</a>).</li>
</ol>
<p>“<a title="Game Timing and Multicore Processors" href="http://msdn.microsoft.com/en-us/library/ee417693(VS.85).aspx" target="_blank">Game Timing and Multicore Processors</a>” has more info, though it’s centered around Windows.</p>
<p>“<a title="Guidelines For Providing Multimedia Support" href="http://www.microsoft.com/whdc/system/sysinternals/mm-timer.mspx" target="_blank">Guidelines For Providing Multimedia Support</a>” has a good summary of the hardware clocks on the PC platform, and justification for the creation of the HPET.</p>
<p>A more technical look into the past problems with RDTSC can be found in <a title="Email from Rick Brunner" href="http://lkml.org/lkml/2005/11/4/173" target="_blank">this email from Rick Brunner (AMD Fellow)</a>.</p>
<p>If you&#8217;re having trouble sleeping, you can look at <a title="Intel: HPET Design" href="http://www.intel.com/hardwaredesign/hpetspec_1.pdf" target="_blank">Intel’s HPET design document</a>.</p>
<p><strong>Linux Clock</strong></p>
<p>POSIX.1b defines realtime clock methods that you’ll find on most *NIX systems (the full spec can be viewed <a title="POSIX Specification 2008" href="http://www.opengroup.org/onlinepubs/9699919799/" target="_blank">HERE</a>). Specifically, you want to use <code>clock_getres()</code> and <code>clock_gettime()</code>. <code>clock_getres()</code> returns the resolution of the clock (the length of one tick, from which the frequency follows), and <code>clock_gettime()</code> returns the current value of the clock. Both fill in a <code>struct timespec</code> (seconds plus nanoseconds), so durations can be computed directly from two readings. Most systems implement the <code>CLOCK_MONOTONIC</code> type, which provides a frequency-stable, monotonically-increasing counter. The resolution of <code>CLOCK_MONOTONIC</code> is high on the 2.6 kernel, in my experience. I recommend using this clock when building a high-performance timing solution on Linux.</p>
<p>The methods are defined in <code>time.h</code>, and you need to link against librt (pass ‘-lrt’ to gcc; glibc 2.17 and later no longer require this). The prototypes for the functions are:</p>
<p><code>int clock_getres(clockid_t <em>clock_id</em>, struct timespec *<em>res</em>);</code><br />
<code>int clock_gettime(clockid_t <em>clock_id</em>, struct timespec *<em>tp</em>);</code></p>
<p>A detailed description of these methods can be found <a href="http://www.opengroup.org/onlinepubs/000095399/functions/clock_getres.html" target="_blank">HERE</a>.</p>
<p>I also suggest looking at <code><a href="http://www.opengroup.org/onlinepubs/000095399/functions/clock_nanosleep.html" target="_blank">clock_nanosleep()</a></code>, but that’s a separate topic.</p>
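<p>As a rough sketch of how these calls fit together for interval timing, something like the following could work. The helper names <code>timespec_diff_seconds()</code> and <code>monotonic_resolution_ns()</code> are made up for illustration; only <code>clock_gettime()</code>, <code>clock_getres()</code>, and <code>CLOCK_MONOTONIC</code> come from POSIX.</p>

```c
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime() and CLOCK_MONOTONIC */
#include <time.h>

/* Seconds between two clock readings. Subtracting the fields first keeps
 * full precision even when the absolute clock value is large. */
static double timespec_diff_seconds(const struct timespec *start,
                                    const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec)
         + (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Resolution of CLOCK_MONOTONIC in nanoseconds (the length of one tick),
 * or -1 if the clock is not available. */
static long long monotonic_resolution_ns(void)
{
    struct timespec res;
    if (clock_getres(CLOCK_MONOTONIC, &res) != 0)
        return -1;
    return (long long)res.tv_sec * 1000000000LL + res.tv_nsec;
}
```

<p>To time a piece of work, call <code>clock_gettime(CLOCK_MONOTONIC, &amp;start)</code> before it and <code>clock_gettime(CLOCK_MONOTONIC, &amp;end)</code> after it, check that both calls return 0, and pass the two readings to <code>timespec_diff_seconds()</code>.</p>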
<p><strong>Windows Clock</strong></p>
<p>On Windows, <code><a href="http://msdn.microsoft.com/en-us/library/ms644905(VS.85).aspx" target="_blank">QueryPerformanceFrequency()</a></code> and <code><a href="http://msdn.microsoft.com/en-us/library/ms644904(v=VS.85).aspx" target="_blank">QueryPerformanceCounter()</a></code> are the obvious choice. <code>QueryPerformanceFrequency()</code> returns (surprise!) the frequency of the counter. <code>QueryPerformanceCounter()</code> returns the current value of the counter. Just like <code>CLOCK_MONOTONIC</code> on Linux, the Windows performance counter is a high-frequency, stable, monotonically-increasing counter.</p>
<p>The methods are defined in <code>windows.h</code>. The prototypes are:</p>
<p><code>BOOL QueryPerformanceFrequency(LARGE_INTEGER *lpFrequency);<br />
BOOL QueryPerformanceCounter(LARGE_INTEGER *lpPerformanceCount);</code></p>
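<p>A minimal sketch of using the two calls together might look like this. The wrapper names (<code>qpc_read_ticks()</code>, <code>qpc_frequency()</code>, <code>qpc_ticks_to_seconds()</code>) are made up for illustration. The Windows-specific part is guarded by <code>_WIN32</code> so the file still compiles on other platforms, and error checking of the <code>BOOL</code> return values is left out for brevity.</p>

```c
#ifdef _WIN32
#include <windows.h>

/* Current value of the performance counter, in raw ticks. */
static long long qpc_read_ticks(void)
{
    LARGE_INTEGER now;
    QueryPerformanceCounter(&now);
    return now.QuadPart;
}

/* Counter frequency in ticks per second. It is fixed at boot,
 * so it only needs to be queried once. */
static long long qpc_frequency(void)
{
    LARGE_INTEGER freq;
    QueryPerformanceFrequency(&freq);
    return freq.QuadPart;
}
#endif

/* Portable helper: convert a tick delta to seconds. */
static double qpc_ticks_to_seconds(long long delta_ticks, long long ticks_per_second)
{
    return (double)delta_ticks / (double)ticks_per_second;
}
```

<p>On Windows, an interval would then be measured as <code>qpc_ticks_to_seconds(end - start, qpc_frequency())</code>, where <code>start</code> and <code>end</code> are two values returned by <code>qpc_read_ticks()</code>.</p>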
<p><strong>Windows Timing Errata</strong></p>
<p><em>Note</em>: The following items do NOT affect the resolution of the Windows performance counter, but I think they&#8217;re still important to know when doing timing on Windows.</p>
<p>One problem you may run into on Windows is that the system clock interval defaults to 10 or 15ms (depending on the OS version). This clock is what drives all the timers and sleep functions for that platform. What this means is that, left to the default, your timers will trigger 5 to 7.5ms late, on average.</p>
<p>You can fix this by increasing the system clock resolution to 2ms (I remember reading a tech article by Microsoft saying that 2ms gave better system performance than 1ms, but I can’t find it for the life of me). You do this by using <code>timeGetDevCaps()</code>, <code>timeBeginPeriod()</code> and <code>timeEndPeriod()</code>. Sample code can be found <a title="Windows Multimedia Timers" href="http://msdn.microsoft.com/en-us/library/dd743626(v=VS.85).aspx" target="_blank">HERE</a>. You have to include <code>mmsystem.h</code> and link against <code>Winmm.lib</code>.</p>
<p>I feel it necessary to note that increasing the system clock resolution (i.e. shortening the clock interval) negatively affects power consumption. The Windows 7 blog has an interesting breakdown of “<a title="Windows 7 Energy Efficiency" href="http://blogs.msdn.com/b/e7/archive/2009/01/06/windows-7-energy-efficiency.aspx" target="_blank">Windows 7 Energy Efficiency</a>”. Specifically, they noticed a 10% drop in battery life when the clock resolution was set to 1ms using <code>timeBeginPeriod()</code>.</p>
<p>Maybe this last section belongs in a separate post, but whatever&#8230; <img src='http://tdistler.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://tdistler.com/2010/06/27/high-performance-timing-on-linux-windows/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Video: &#8220;Swagger Wagon&#8221; Toyota Commercial</title>
		<link>http://tdistler.com/2010/06/26/video-swagger-wagon-toyota-commercial</link>
		<comments>http://tdistler.com/2010/06/26/video-swagger-wagon-toyota-commercial#comments</comments>
		<pubDate>Sun, 27 Jun 2010 04:50:01 +0000</pubDate>
		<dc:creator>Tom</dc:creator>
				<category><![CDATA[Oh So Random]]></category>
		<category><![CDATA[funny]]></category>
		<category><![CDATA[video]]></category>

		<guid isPermaLink="false">http://tdistler.com/?p=361</guid>
		<description><![CDATA[Being a parent now, I found this commercial pretty funny.

]]></description>
			<content:encoded><![CDATA[<p>Being a parent now, I found this commercial pretty funny.</p>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="640" height="385" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="src" value="http://www.youtube.com/v/ql-N3F1FhW4&amp;color1=0xb1b1b1&amp;color2=0xd0d0d0&amp;hl=en_US&amp;feature=player_detailpage&amp;fs=1" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="640" height="385" src="http://www.youtube.com/v/ql-N3F1FhW4&amp;color1=0xb1b1b1&amp;color2=0xd0d0d0&amp;hl=en_US&amp;feature=player_detailpage&amp;fs=1" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
]]></content:encoded>
			<wfw:commentRss>http://tdistler.com/2010/06/26/video-swagger-wagon-toyota-commercial/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Macros with Variable Argument Lists</title>
		<link>http://tdistler.com/2010/06/21/macros-with-variable-argument-lists</link>
		<comments>http://tdistler.com/2010/06/21/macros-with-variable-argument-lists#comments</comments>
		<pubDate>Mon, 21 Jun 2010 22:46:39 +0000</pubDate>
		<dc:creator>Tom</dc:creator>
				<category><![CDATA[Code Monkey]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://tdistler.com/?p=325</guid>
		<description><![CDATA[In C/C++, macros that take a variable number of arguments (called variadic macros) can be very useful. Having printf-style macros just makes certain things easier to read and understand. Below I&#8217;ll describe how to do this in a way that works on Windows and Linux, as well as supports empty argument lists.
Example
 Consider the following [...]]]></description>
			<content:encoded><![CDATA[<p>In C/C++, macros that take a variable number of arguments (called <a title="Variadic Macros" href="http://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html" target="_blank">variadic macros</a>) can be very useful. Having printf-style macros just makes certain things easier to read and understand. Below I&#8217;ll describe how to do this in a way that works on Windows and Linux, as well as supports empty argument lists.</p>
<p><strong>Example</strong></p>
<p>Consider the following log function:</p>
<p><code>void WriteLog( const LOG_LEVEL level, const char* file, const int line, const char* format, ... );</code></p>
<p>Pretty straight-forward: it allows you to specify the log level (NOTICE, WARNING, ERROR, etc), a file name and line number, and a user-defined description in &#8220;printf&#8221; format. To log a warning message, you would type:</p>
<p><code>WriteLog( LL_WARNING, __FILE__, __LINE__, "[WARNING] Failed to import descriptor '%s:%i'!", _descriptor, _id );</code></p>
<p>The entry in the log file would look like:</p>
<p><code>Core.cpp:715 [WARNING] Failed to import descriptor 'timing-engine'!</code></p>
<p><strong>Creating a Variadic Macro</strong></p>
<p>Calling <code>WriteLog</code> directly works fine, but it&#8217;s pretty verbose and annoying to use. One way to make it simpler is to wrap <code>WriteLog</code> with a variadic macro:</p>
<p><code>#define LOG_WARNING( format, ... ) \<br />
WriteLog( LL_WARNING, __FILE__, __LINE__, "[WARNING] " format, ##__VA_ARGS__ )</code></p>
<p>Then the example above would become:</p>
<p><code>LOG_WARNING( "Failed to import descriptor '%s:%i'!", _descriptor, _id );</code></p>
<p>The macro automatically fills in the file and line number where the log message was generated, and the log level is specified by the macro name (<code>LOG_WARNING</code>). This is much more straight-forward and I believe it makes the code easier to read.</p>
<p>Notice that <code>__VA_ARGS__</code> is used to get the variable argument list and pass it to <code>WriteLog</code>. During compilation, the preprocessor replaces <code>__VA_ARGS__</code> with the comma-separated list of arguments. The &#8216;<code>##</code>&#8217; that prefixes <code>__VA_ARGS__</code> is vital if you want your macro to work like you expect. Without it, you would be <em>required</em> to have at least one argument after the format string. &#8216;<code>##</code>&#8217; has a special meaning in this position (it is a GCC extension rather than standard C/C++): it causes the preceding comma to be deleted if there are no variable arguments. Without this, the following line would generate a syntax error:</p>
<p><code>LOG_WARNING( "This would cause a syntax error" );</code></p>
<p>Prefixing <code>__VA_ARGS__</code> with &#8216;<code>##</code>&#8216; allows the above code to work just fine.</p>
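<p>Putting the pieces together, here is a minimal, self-contained sketch. The body of <code>WriteLog</code>, the <code>LOG_LEVEL</code> enum values other than <code>LL_WARNING</code>, and the <code>g_last_log</code> buffer are all assumptions made for illustration; a real logger would write to a file instead. It relies on the <code>##__VA_ARGS__</code> extension described above, so it needs a compiler that supports it (GCC and Clang do).</p>

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Assumed log levels; only LL_WARNING appears in the text above. */
typedef enum { LL_NOTICE, LL_WARNING, LL_ERROR } LOG_LEVEL;

/* For this sketch the most recent entry is kept in a buffer so it can be
   inspected. A real implementation would append to a log file. */
static char g_last_log[512];

void WriteLog( const LOG_LEVEL level, const char* file, const int line, const char* format, ... )
{
    va_list args;
    int used;

    /* Log just the file name, not the full path the compiler may supply. */
    const char* slash = strrchr( file, '/' );
    const char* name = slash ? slash + 1 : file;

    (void)level;  /* a real logger would filter on this */

    used = snprintf( g_last_log, sizeof(g_last_log), "%s:%d ", name, line );
    if ( used < 0 || (size_t)used >= sizeof(g_last_log) )
        return;

    /* Forward the variable arguments to the printf-style formatter. */
    va_start( args, format );
    vsnprintf( g_last_log + used, sizeof(g_last_log) - (size_t)used, format, args );
    va_end( args );
}

/* "[WARNING] " format relies on string-literal concatenation, so the format
   argument must be a string literal. No trailing semicolon: the caller
   supplies it. */
#define LOG_WARNING( format, ... ) \
    WriteLog( LL_WARNING, __FILE__, __LINE__, "[WARNING] " format, ##__VA_ARGS__ )
```

<p>With this in place, both <code>LOG_WARNING( "no arguments" );</code> and <code>LOG_WARNING( "descriptor '%s:%i'", name, id );</code> compile, because the comma before <code>##__VA_ARGS__</code> is dropped in the first case.</p>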
]]></content:encoded>
			<wfw:commentRss>http://tdistler.com/2010/06/21/macros-with-variable-argument-lists/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Stop Stealing My File Descriptors!</title>
		<link>http://tdistler.com/2010/06/18/stop-stealing-my-file-descriptors</link>
		<comments>http://tdistler.com/2010/06/18/stop-stealing-my-file-descriptors#comments</comments>
		<pubDate>Fri, 18 Jun 2010 21:48:43 +0000</pubDate>
		<dc:creator>Tom</dc:creator>
				<category><![CDATA[Code Monkey]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://tdistler.com/?p=313</guid>
		<description><![CDATA[We ran into a weird problem the other day where our Linux video display appliance would lose audio support when the process was restarted. The audio was supposed to play through a custom joystick-keyboard that was attached via USB (the keyboard is used by security guards to PTZ cameras, control monitors, etc). The audio could [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://tdistler.com/wp-content/uploads/2010/06/sherlock_tux.jpg"><img class="alignright size-thumbnail wp-image-319" title="sherlock_tux" src="http://tdistler.com/wp-content/uploads/2010/06/sherlock_tux-150x150.jpg" alt="Sherlock Tux" width="150" height="150" /></a>We ran into a weird problem the other day where our Linux video display appliance would lose audio support when the process was restarted. The audio was supposed to play through a custom joystick-keyboard that was attached via USB (the keyboard is used by security guards to PTZ cameras, control monitors, etc). The audio could be heard just fine when the box first booted, but if the application restarted audio would be lost.</p>
<p>Looking at the logs, we found that our audio pipeline was failing to open <code>/dev/dsp</code> on the restart. We then used <code>lsof</code> to list the open file descriptors to see which process currently held <code>/dev/dsp</code>:</p>
<p><code># lsof | grep /dev/dsp<br />
ntpd   18857    root   16u    CHR     14,3    180099 /dev/dsp<br />
</code></p>
<p>What!?!?&#8230; why the heck is NTP opening the sound device and how did it steal it from us??? After some discussion we started remembering a problem in the past with <code>ntpd</code> stealing our SNMP diagnostics port. This just didn&#8217;t make any sense.</p>
<p>Digging into our appliance code, we found this line:</p>
<p><code>system( "service ntpd restart" );</code></p>
<p>This would be called each time we were notified by the security system that the NTP server address had changed (which fired once each time the process was started so we could get the initial address). But this still didn&#8217;t explain why NTP took over ownership of our file descriptors on restart.</p>
<p>Long story short: <code>system()</code> is implemented as <code>fork()</code> followed by an <code>exec</code>-family call that runs the command through the shell. By default, <code>fork()</code> gives a copy of the parent&#8217;s file descriptors to the child process, and those copies stay open across the <code>exec</code> (i.e. the <code>ntpd</code> child process got a copy of the <code>/dev/dsp</code> file descriptor). To prevent this, you have to set the <code>FD_CLOEXEC</code> flag on the file descriptors you don&#8217;t want inherited; they are then closed automatically when the child calls <code>exec</code>.</p>
<p>For example:</p>
<p><code>fd = open( "/dev/dsp", O_RDWR );<br />
fcntl( fd, F_SETFD, FD_CLOEXEC );</code></p>
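<p>A slightly more defensive version of the same idea is sketched below. The helper names <code>set_cloexec()</code> and <code>open_cloexec()</code> are invented for this example. It reads the current descriptor flags first so that nothing else gets clobbered, and it checks every return value.</p>

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: set FD_CLOEXEC on fd while preserving any other
   descriptor flags. Returns 0 on success, -1 on error (errno set by fcntl). */
int set_cloexec( int fd )
{
    int flags = fcntl( fd, F_GETFD );
    if ( flags == -1 )
        return -1;

    return ( fcntl( fd, F_SETFD, flags | FD_CLOEXEC ) == -1 ) ? -1 : 0;
}

/* Hypothetical helper: open a path and mark the descriptor close-on-exec.
   Returns the descriptor, or -1 on error. */
int open_cloexec( const char* path, int oflags )
{
    int fd = open( path, oflags );
    if ( fd == -1 )
        return -1;

    if ( set_cloexec( fd ) == -1 ) {
        close( fd );
        return -1;
    }
    return fd;
}
```

<p>Note that there is a small window between <code>open()</code> and <code>fcntl()</code> where another thread could fork. On Linux 2.6.23 and later you can close that window by passing <code>O_CLOEXEC</code> directly to <code>open()</code>, which sets the flag atomically.</p>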
<p><strong>Conclusion</strong>: setting the <code>FD_CLOEXEC</code> flag on the <code>/dev/dsp</code> file descriptor fixed the problem for audio. However, most of the other file descriptors still got owned by <code>ntpd</code>. Did we go back and set the <code>FD_CLOEXEC</code> flag on all file descriptors, you ask? Nope. It turns out we had a script monitoring the NTP config file and restarting <code>ntpd</code> for us when the file got updated&#8230; we just had to update the config file and remove the <code>system( "service ntpd restart" )</code> call.</p>
<p>Oh, and the reason audio worked on first boot but not subsequent restarts was due to a weird race condition around when <code>/dev/dsp</code> got opened.</p>
]]></content:encoded>
			<wfw:commentRss>http://tdistler.com/2010/06/18/stop-stealing-my-file-descriptors/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>IBM&#8217;s &#8220;Watson&#8221; Beats Contestants at Jeopardy</title>
		<link>http://tdistler.com/2010/06/17/ibms-watson-beats-contestants-at-jeopardy</link>
		<comments>http://tdistler.com/2010/06/17/ibms-watson-beats-contestants-at-jeopardy#comments</comments>
		<pubDate>Thu, 17 Jun 2010 19:00:06 +0000</pubDate>
		<dc:creator>Tom</dc:creator>
				<category><![CDATA[Tech and Security]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://tdistler.com/?p=306</guid>
		<description><![CDATA[The New York Times has a great article on a new system developed by IBM named &#8220;Watson&#8221;. It&#8217;s a computer system that&#8217;s scraped tens of millions of documents from the Internet and compiled a massive database of knowledge. It uses natural language parsing to interpret questions and generate answers. The cool thing is that it [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://tdistler.com/wp-content/uploads/2010/06/ibm_watson_jeopardy.jpg"><img class="alignright size-thumbnail wp-image-309" title="ibm_watson_jeopardy" src="http://tdistler.com/wp-content/uploads/2010/06/ibm_watson_jeopardy-150x150.jpg" alt="IBM Watson" width="150" height="150" /></a>The New York Times has a <a title="New York Times: What Is I.B.M.’s Watson?" href="http://www.nytimes.com/2010/06/20/magazine/20Computer-t.html" target="_blank">great article</a> on a new system developed by IBM named &#8220;Watson&#8221;. It&#8217;s a computer system that&#8217;s scraped tens of millions of documents from the Internet and compiled a massive database of knowledge. It uses natural language parsing to interpret questions and generate answers. The cool thing is that it beat former Jeopardy contestants 4 out of 6 times in mock Jeopardy sessions. Here are some quotes from the article I found interesting:</p>
<blockquote><p>It displayed remarkable facility with cultural trivia (“This action flick starring Roy Scheider in a high-tech police helicopter was also briefly a TV series” — “What is ‘Blue Thunder’?”), science (“The greyhound originated more than 5,000 years ago in this African country, where it was used to hunt gazelles” — “What is Egypt?”) and sophisticated wordplay (“Classic candy bar that’s a female <a title="More articles about the U.S. Supreme Court." href="http://topics.nytimes.com/top/reference/timestopics/organizations/s/supreme_court/index.html?inline=nyt-org">Supreme Court</a> justice” — “What is Baby Ruth Ginsburg?”).</p></blockquote>
<blockquote><p>Software firms and university scientists have produced question-answering systems for years, but these have mostly been limited to simply phrased questions. Nobody ever tackled “Jeopardy!” because experts assumed that even for the latest artificial intelligence, the game was simply too hard: the clues are too puzzling and allusive, and the breadth of trivia is too wide.</p>
<p>With Watson, I.B.M. claims it has cracked the problem — and aims to prove as much on national TV. <strong><em>The producers of “Jeopardy!” have agreed to pit Watson against some of the game’s best former players as early as this fall</em><span style="font-weight: normal;"> (emphasis mine)</span></strong>. To test Watson’s capabilities against actual humans, I.B.M.’s scientists began holding live matches last winter.</p></blockquote>
<p>I&#8217;d definitely watch that episode&#8230; especially if Watson was pitted against <a title="Wikipedia: Ken Jennings" href="http://en.wikipedia.org/wiki/Ken_Jennings" target="_blank">Ken Jennings</a>.</p>
<p>Under the hood:</p>
<blockquote><p>[IBM's] main breakthrough was not the design of any single, brilliant new technique for analyzing language. Indeed, many of the statistical techniques Watson employs were already well known by computer scientists. One important thing that makes Watson so different is its enormous speed and memory. Taking advantage of I.B.M.’s supercomputing heft, Ferrucci’s team input millions of documents into Watson to build up its knowledge base — including, he says, “books, reference material, any sort of dictionary, thesauri, folksonomies, taxonomies, encyclopedias, any kind of reference material you can imagine getting your hands on or licensing. Novels, bibles, plays.”</p></blockquote>
<p>The full article is worth the read.</p>
]]></content:encoded>
			<wfw:commentRss>http://tdistler.com/2010/06/17/ibms-watson-beats-contestants-at-jeopardy/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Motivational Poster: Ninjas</title>
		<link>http://tdistler.com/2010/05/29/motivational-poster-ninjas</link>
		<comments>http://tdistler.com/2010/05/29/motivational-poster-ninjas#comments</comments>
		<pubDate>Sun, 30 May 2010 05:14:37 +0000</pubDate>
		<dc:creator>Tom</dc:creator>
				<category><![CDATA[Oh So Random]]></category>
		<category><![CDATA[motivational poster]]></category>

		<guid isPermaLink="false">http://tdistler.com/?p=285</guid>
		<description><![CDATA[I&#8217;d so give this guy money.

]]></description>
			<content:encoded><![CDATA[<p>I&#8217;d so give this guy money.</p>
<p><a href="http://tdistler.com/wp-content/uploads/2010/05/ninja_bumb.jpg"><img class="aligncenter size-full wp-image-284" title="ninja_bumb" src="http://tdistler.com/wp-content/uploads/2010/05/ninja_bumb.jpg" alt="Ninjas" width="320" height="277" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://tdistler.com/2010/05/29/motivational-poster-ninjas/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
