<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Robert Horvick &#187; ets</title>
	<atom:link href="http://www.roberthorvick.com/tag/ets/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.roberthorvick.com</link>
	<description>Things my wife doesn&#039;t want on the family blog...</description>
	<lastBuildDate>Sat, 08 May 2010 23:27:04 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>The last word frequency post &#8211; from dict to ets</title>
		<link>http://www.roberthorvick.com/2009/07/03/the-last-word-frequency-post-from-dict-to-ets/</link>
		<comments>http://www.roberthorvick.com/2009/07/03/the-last-word-frequency-post-from-dict-to-ets/#comments</comments>
		<pubDate>Sat, 04 Jul 2009 03:15:56 +0000</pubDate>
		<dc:creator>robert</dc:creator>
				<category><![CDATA[Erlang]]></category>
		<category><![CDATA[ets]]></category>

		<guid isPermaLink="false">http://www.roberthorvick.com/?p=57</guid>
		<description><![CDATA[One last iteration through my learning exercise of building a word frequency list.  In this last post I&#8217;m moving away from a dict and to an ets table.  I was pleasantly surprised how easy the conversion was.  For example printing the output was just converting from dict:fold to ets:foldl.  The one [...]]]></description>
			<content:encoded><![CDATA[<p>One last iteration through my learning exercise of building a word frequency list.  In this last post I&#8217;m moving away from a dict and to an ets table.  I was pleasantly surprised by how easy the conversion was.  For example, printing the output was just a matter of converting from dict:fold to ets:foldl.  The one parity fail was that dict:update can take an initial value to use when the key is missing, but neither ets:update_counter nor any other ets function offers that.  This required that I write a little wrapper function to call from lists:foldl (instead of having a multi-line inlined fun).</p>
<p>No point in getting too deep into this &#8211; here&#8217;s the code:</p>

<div class="wp_syntax"><div class="code"><pre class="erlang" style="font-family:monospace;"><span style="color: #014ea4;">-</span><span style="color: #5400b3;">module</span><span style="color: #109ab8;">&#40;</span>wordets<span style="color: #109ab8;">&#41;</span><span style="color: #6bb810;">.</span>
&nbsp;
<span style="color: #014ea4;">-</span><span style="color: #5400b3;">export</span><span style="color: #109ab8;">&#40;</span><span style="color: #109ab8;">&#91;</span>print_word_counts<span style="color: #014ea4;">/</span><span style="color: #ff9600;">1</span><span style="color: #109ab8;">&#93;</span><span style="color: #109ab8;">&#41;</span><span style="color: #6bb810;">.</span>
&nbsp;
<span style="color: #666666; font-style: italic;">%% returns the list of words in String (an empty list when nothing</span>
<span style="color: #666666; font-style: italic;">%% matches, e.g. a blank line, where re:run returns nomatch)</span>
<span style="color: #ff3c00;">words</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">String</span><span style="color: #109ab8;">&#41;</span> <span style="color: #6bb810;">-&gt;</span>
  <span style="color: #186895;">case</span> <span style="color: #ff4e18;">re</span>:<span style="color: #ff3c00;">run</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">String</span><span style="color: #6bb810;">,</span> <span style="color: #ff7800;">&quot;<span style="color: #000099; font-weight: bold;">\\</span>b<span style="color: #000099; font-weight: bold;">\\</span>w+<span style="color: #000099; font-weight: bold;">\\</span>b&quot;</span><span style="color: #6bb810;">,</span> <span style="color: #109ab8;">&#91;</span>global<span style="color: #6bb810;">,</span><span style="color: #109ab8;">&#123;</span>capture<span style="color: #6bb810;">,</span>first<span style="color: #6bb810;">,</span><span style="color: #fa6fff;">list</span><span style="color: #109ab8;">&#125;</span><span style="color: #109ab8;">&#93;</span><span style="color: #109ab8;">&#41;</span> <span style="color: #186895;">of</span>
    <span style="color: #109ab8;">&#123;</span>match<span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">Captures</span><span style="color: #109ab8;">&#125;</span> <span style="color: #6bb810;">-&gt;</span> <span style="color: #ff4e18;">lists</span>:<span style="color: #ff3c00;">append</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">Captures</span><span style="color: #109ab8;">&#41;</span><span style="color: #6bb810;">;</span>
    nomatch <span style="color: #6bb810;">-&gt;</span> <span style="color: #109ab8;">&#91;</span><span style="color: #109ab8;">&#93;</span>
  <span style="color: #186895;">end</span><span style="color: #6bb810;">.</span>
&nbsp;
<span style="color: #666666; font-style: italic;">%% reads the next line from the file.  If there is data then...</span>
<span style="color: #666666; font-style: italic;">%% split the data into a list of words and add to the word table</span>
<span style="color: #ff3c00;">process_each_line</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">IoDevice</span><span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">Table</span><span style="color: #109ab8;">&#41;</span> <span style="color: #6bb810;">-&gt;</span>
  <span style="color: #186895;">case</span> <span style="color: #ff4e18;">io</span>:<span style="color: #ff3c00;">get_line</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">IoDevice</span><span style="color: #6bb810;">,</span> <span style="color: #ff7800;">&quot;&quot;</span><span style="color: #109ab8;">&#41;</span> <span style="color: #186895;">of</span>
    eof <span style="color: #6bb810;">-&gt;</span> 
      <span style="color: #ff4e18;">file</span>:<span style="color: #ff3c00;">close</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">IoDevice</span><span style="color: #109ab8;">&#41;</span><span style="color: #6bb810;">,</span>
      <span style="color: #45b3e6;">Table</span><span style="color: #6bb810;">;</span>
    <span style="color: #109ab8;">&#123;</span>error<span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">Reason</span><span style="color: #109ab8;">&#125;</span> <span style="color: #6bb810;">-&gt;</span>
      <span style="color: #ff4e18;">file</span>:<span style="color: #ff3c00;">close</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">IoDevice</span><span style="color: #109ab8;">&#41;</span><span style="color: #6bb810;">,</span>
      <span style="color: #ff3c00;">throw</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">Reason</span><span style="color: #109ab8;">&#41;</span><span style="color: #6bb810;">;</span>
    <span style="color: #45b3e6;">Data</span> <span style="color: #6bb810;">-&gt;</span>
      <span style="color: #45b3e6;">NewTable</span> <span style="color: #014ea4;">=</span> <span style="color: #ff4e18;">lists</span>:<span style="color: #ff3c00;">foldl</span><span style="color: #109ab8;">&#40;</span>
        <span style="color: #ff3c00;">fun</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">W</span><span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">T</span><span style="color: #109ab8;">&#41;</span> <span style="color: #6bb810;">-&gt;</span> <span style="color: #ff3c00;">update_word_count</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">W</span><span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">T</span><span style="color: #109ab8;">&#41;</span> <span style="color: #186895;">end</span><span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">Table</span><span style="color: #6bb810;">,</span> <span style="color: #ff3c00;">words</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">Data</span><span style="color: #109ab8;">&#41;</span><span style="color: #109ab8;">&#41;</span><span style="color: #6bb810;">,</span>
      <span style="color: #ff3c00;">process_each_line</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">IoDevice</span><span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">NewTable</span><span style="color: #109ab8;">&#41;</span>
  <span style="color: #186895;">end</span><span style="color: #6bb810;">.</span>
&nbsp;
<span style="color: #ff3c00;">update_word_count</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">Word</span><span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">Table</span><span style="color: #109ab8;">&#41;</span> <span style="color: #6bb810;">-&gt;</span>
  <span style="color: #186895;">case</span> <span style="color: #ff4e18;">ets</span>:<span style="color: #ff3c00;">lookup</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">Table</span><span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">Word</span><span style="color: #109ab8;">&#41;</span> <span style="color: #186895;">of</span>
    <span style="color: #109ab8;">&#91;</span><span style="color: #109ab8;">&#123;</span><span style="color: #45b3e6;">Word</span><span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">_</span><span style="color: #109ab8;">&#125;</span><span style="color: #109ab8;">&#93;</span> <span style="color: #6bb810;">-&gt;</span>
      <span style="color: #ff4e18;">ets</span>:<span style="color: #ff3c00;">update_counter</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">Table</span><span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">Word</span><span style="color: #6bb810;">,</span> <span style="color: #ff9600;">1</span><span style="color: #109ab8;">&#41;</span><span style="color: #6bb810;">;</span> 
    <span style="color: #109ab8;">&#91;</span><span style="color: #109ab8;">&#93;</span> <span style="color: #6bb810;">-&gt;</span>
      <span style="color: #ff4e18;">ets</span>:<span style="color: #ff3c00;">insert</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">Table</span><span style="color: #6bb810;">,</span> <span style="color: #109ab8;">&#123;</span><span style="color: #45b3e6;">Word</span><span style="color: #6bb810;">,</span> <span style="color: #ff9600;">1</span><span style="color: #109ab8;">&#125;</span><span style="color: #109ab8;">&#41;</span>
  <span style="color: #186895;">end</span><span style="color: #6bb810;">,</span>
  <span style="color: #45b3e6;">Table</span><span style="color: #6bb810;">.</span>
&nbsp;
<span style="color: #ff3c00;">print_words</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">Words</span><span style="color: #109ab8;">&#41;</span> <span style="color: #6bb810;">-&gt;</span>
  <span style="color: #ff4e18;">ets</span>:<span style="color: #ff3c00;">foldl</span><span style="color: #109ab8;">&#40;</span><span style="color: #ff3c00;">fun</span><span style="color: #109ab8;">&#40;</span><span style="color: #109ab8;">&#123;</span><span style="color: #45b3e6;">W</span><span style="color: #6bb810;">,</span><span style="color: #45b3e6;">C</span><span style="color: #109ab8;">&#125;</span><span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">AccIn</span><span style="color: #109ab8;">&#41;</span> <span style="color: #6bb810;">-&gt;</span> 
    <span style="color: #ff4e18;">io</span>:<span style="color: #ff3c00;">format</span><span style="color: #109ab8;">&#40;</span><span style="color: #ff7800;">&quot;~s: ~w~n&quot;</span><span style="color: #6bb810;">,</span> <span style="color: #109ab8;">&#91;</span><span style="color: #45b3e6;">W</span><span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">C</span><span style="color: #109ab8;">&#93;</span><span style="color: #109ab8;">&#41;</span><span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">AccIn</span> <span style="color: #186895;">end</span><span style="color: #6bb810;">,</span> void<span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">Words</span><span style="color: #109ab8;">&#41;</span><span style="color: #6bb810;">.</span>
&nbsp;
<span style="color: #666666; font-style: italic;">%% opens the indicated file, processes the contents and prints</span>
<span style="color: #666666; font-style: italic;">%% out the word/count pairs to stdout</span>
<span style="color: #ff3c00;">print_word_counts</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">Filename</span><span style="color: #109ab8;">&#41;</span> <span style="color: #6bb810;">-&gt;</span>
  <span style="color: #109ab8;">&#123;</span>ok<span style="color: #6bb810;">,</span> <span style="color: #45b3e6;">IoDevice</span><span style="color: #109ab8;">&#125;</span> <span style="color: #014ea4;">=</span> <span style="color: #ff4e18;">file</span>:<span style="color: #ff3c00;">open</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">Filename</span><span style="color: #6bb810;">,</span> read<span style="color: #109ab8;">&#41;</span><span style="color: #6bb810;">,</span>
  <span style="color: #45b3e6;">Words</span> <span style="color: #014ea4;">=</span> <span style="color: #ff3c00;">process_each_line</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">IoDevice</span><span style="color: #6bb810;">,</span> <span style="color: #ff4e18;">ets</span>:<span style="color: #ff3c00;">new</span><span style="color: #109ab8;">&#40;</span>words<span style="color: #6bb810;">,</span> <span style="color: #109ab8;">&#91;</span><span style="color: #109ab8;">&#93;</span><span style="color: #109ab8;">&#41;</span><span style="color: #109ab8;">&#41;</span><span style="color: #6bb810;">,</span>
  <span style="color: #ff3c00;">print_words</span><span style="color: #109ab8;">&#40;</span><span style="color: #45b3e6;">Words</span><span style="color: #109ab8;">&#41;</span><span style="color: #6bb810;">.</span></pre></div></div>
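<p>A possible simplification of the wrapper above, sketched under the assumption of Erlang/OTP 18 or later: newer releases provide a four-argument <code>ets:update_counter/4</code> whose last argument is a default object to insert when the key is missing, which would make the <code>ets:lookup</code> step unnecessary. The module name <code>wordets_default</code> and the helper <code>count_words/1</code> are made up for illustration and are not part of the original code.</p>

```erlang
-module(wordets_default).
-export([update_word_count/2, count_words/1]).

%% Increment the count for Word. When the key is missing, the default
%% object {Word, 0} is inserted first and then incremented to 1.
%% Assumption: Erlang/OTP 18 or later, where ets:update_counter/4 exists.
update_word_count(Word, Table) ->
  ets:update_counter(Table, Word, 1, {Word, 0}),
  Table.

%% Hypothetical helper: count a list of words in a fresh table and
%% return the {Word, Count} pairs sorted by word.
count_words(Words) ->
  Table = ets:new(words, []),
  lists:foldl(fun update_word_count/2, Table, Words),
  Result = lists:sort(ets:tab2list(Table)),
  ets:delete(Table),
  Result.
```

<p>For instance, <code>wordets_default:count_words(["a", "b", "a"])</code> should return <code>[{"a",2},{"b",1}]</code>. On older releases the lookup-then-insert wrapper is still needed.</p>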

<p>The ets implementation feels a bit forced (which it was &#8211; the point was to learn another module).  I don&#8217;t think I&#8217;d have gone this way in practice unless I wanted to persist the frequency data to a file, or the word data were more complex (for example, if I were storing where in the file each word appeared, its neighboring words, etc.).</p>
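<p>As a rough sketch of the persistence case mentioned above, an ets table can be dumped to a file and read back with <code>ets:tab2file/2</code> and <code>ets:file2tab/1</code>. The module name <code>wordets_persist</code> and both function names are assumptions made for illustration, not something the original code does.</p>

```erlang
-module(wordets_persist).
-export([save_counts/2, load_counts/1]).

%% Dump the whole table to Filename; returns ok or {error, Reason}.
save_counts(Table, Filename) ->
  ets:tab2file(Table, Filename).

%% Recreate a table from a file written by save_counts/2 and return
%% its contents as a sorted list of {Word, Count} pairs. The temporary
%% table is deleted once its contents have been copied out.
load_counts(Filename) ->
  {ok, Table} = ets:file2tab(Filename),
  Counts = lists:sort(ets:tab2list(Table)),
  ets:delete(Table),
  Counts.
```

<p>Here <code>save_counts(Words, "counts.tab")</code> would be called after <code>process_each_line/2</code> returns; the file name is just a placeholder.</p>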
<p>Enough of this sample.  On to something more substantial.</p>



]]></content:encoded>
			<wfw:commentRss>http://www.roberthorvick.com/2009/07/03/the-last-word-frequency-post-from-dict-to-ets/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
