
One of the exciting new technologies announced at PDC 2010 is support for asynchronous programming in C#. So what exactly is asynchronous programming? Many applications today need to connect to some service or external data source, for example Web Services and REST APIs. These calls may take a long time to run, but we want to run them without blocking the application.
Currently, you can run the operation on a background thread or using a Task, but coordinating multiple such operations is difficult. What if, for example, you need to wait until any (or all) of the downloads complete and then run some more code? This is not only difficult, but it also scales badly, because blocking .NET threads is a bad practice (threads are expensive and when they're just waiting for another operation to complete, we're wasting valuable resources). This problem was the main motivation for including asynchronous workflows in F# about 3 years ago. In F#, this also enabled various interesting programming styles, for example creating GUI using asynchronous workflows (also discussed in Chapter 16 of my book and in my recent talk). The C# asynchronous programming support and the await keyword are largely inspired by F# asynchronous workflows (I was quite surprised that F# wasn't more visibly mentioned in the PDC talk).
In this article series, I'll demonstrate both the F# and C# asynchronous programming models, look at features that are missing in one or the other as well as a few subtle differences (that may be unexpected) and, finally, write about creating (asynchronous) F# libraries that are easy to use from C#. The plan for the series is as follows:
Let's start with a brief overview of asynchronous programming features in C# and F#...
Perhaps the best example to demonstrate asynchronous programming on .NET is implementing a method to download the content of a web page. In practice, this can be done easily using WebClient, but it works well as a demo. In the implementation, we need to get the HTTP response and then read data from the stream repeatedly in a loop into a buffer.
Writing the code using the BeginGetResponse and BeginRead methods is quite difficult. We can pass a lambda function as an argument, but there isn't an easy way of implementing the looping. The problem is that the iteration is started in one scope, but continues in a different scope (the body of the lambda). As a result, we cannot use any standard control-flow constructs such as while and for, or even try ... catch.
If the feature makes it into C# 5.0 (which I would expect) then we'll be able to write the same thing more easily using the await construct. The following snippet shows the initialization of the HTTP request - we'll discuss the actual downloading in the next snippet.
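To make the problem concrete, here is a minimal sketch of such a callback-based loop. It is not the code from any official demo: it reads from an in-memory stream instead of an HTTP response so that it runs on its own, the names source, OnRead and done are made up for the illustration, and it assumes a recent C# version with top-level statements. Notice that the "loop" is a callback that has to start the next read itself.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

// Stand-in for a network stream: 2500 bytes of in-memory data (hypothetical input)
var source = new MemoryStream(Encoding.ASCII.GetBytes(new string('x', 2500)));
var buffer = new byte[1024];
var total = 0;
var done = new ManualResetEventSlim();

// The "loop body" has to be a callback that schedules itself again,
// because each iteration resumes in a different scope.
void OnRead(IAsyncResult ar)
{
    int count = source.EndRead(ar);
    if (count > 0)
    {
        total += count;
        // "Jump back to the start of the loop" by starting another read
        source.BeginRead(buffer, 0, buffer.Length, OnRead, null);
    }
    else
    {
        // End of stream: signal the waiting caller
        done.Set();
    }
}

source.BeginRead(buffer, 0, buffer.Length, OnRead, null);
done.Wait();   // blocks here only to keep the demo process alive
Console.WriteLine("Read {0} bytes", total);   // prints "Read 2500 bytes"
```

Even in this tiny example there is no while statement anywhere, and wrapping the whole download in a single try ... catch is not possible, because every callback invocation runs in its own scope.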
    async Task<Tuple<string, int>> DownloadPage(string url) {
      // Create web request to download a web site
      var request = HttpWebRequest.Create(url);

      // Asynchronously obtain the HTTP response
      using (var response = await request.GetResponseAsync()) {
        // Start reading the response stream
        using (var stream = response.GetResponseStream()) {
          // TODO: Asynchronously copy content of the stream
          // TODO: Read 'html' content and extract 'title'
          return Tuple.Create(title, html.Length);
        }
      }
    }

The method creates a web request and then obtains an HTTP response from the server. It uses the usual using statement to ensure that all created objects will be disposed. At first glance, it looks like a completely ordinary C# method, but it is not. There are two important differences:
* async keyword and Task return type - the return type of the method is Task<Tuple<string, int>>, but there is no task created anywhere in the method. It just creates a tuple and returns it using the return statement. This would look like an error, but the method is also marked with the async keyword.
* await keyword - when getting the response in the first using statement, we use the new keyword await and we call an asynchronous variant of the method named GetResponseAsync.

The two changes above turn a usual method into an asynchronous method. When called, the asynchronous method starts running, but returns a task before it completes. The caller can then do some other work before waiting for the result of the call. The waiting is done using the new await construct. The construct can appear almost anywhere in an expression and we can use it to obtain the result of a Task<T> or to wait until a Task completes.
There is an important difference between writing task.Result and await task. In the first case, the Result property simply blocks the thread until the task completes and produces a result. On the other hand, the await construct is handled in a special way by the C# compiler. The compiler translates the method into a state machine and replaces all uses of await with code that sets a continuation of a task and returns. A continuation is simply a function that will be called when the task completes and will continue running the body of the method. This requires quite a bit of reorganization in the body of the method and it is not something that can be easily done by hand. The key point is that using await will not block any thread.
Let's now look at the second snippet. It shows a loop that asynchronously downloads the page content using a 1kB buffer. When it completes, it extracts the title of the page and returns it together with the length of the page:
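The three options can be put side by side in a small sketch. ComputeAsync is a hypothetical stand-in for any slow operation, and the code assumes a recent C# version where await is allowed in top-level statements. The hand-written ContinueWith call only approximates what the compiler generates; the real translation is a state machine.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical asynchronous operation standing in for a slow call
Task<int> ComputeAsync() => Task.Run(() => 21 * 2);

// 1. Blocking: the calling thread waits inside the Result property
int blocking = ComputeAsync().Result;

// 2. Hand-written continuation: register a function to run on completion
Task<string> byHand = ComputeAsync().ContinueWith(t => "Result: " + t.Result);

// 3. await: the compiler generates the continuation for the rest of the method
async Task<string> WithAwait()
{
    int value = await ComputeAsync();
    return "Result: " + value;
}

Console.WriteLine(blocking);            // prints 42
Console.WriteLine(await byHand);        // prints "Result: 42"
Console.WriteLine(await WithAwait());   // prints "Result: 42"
```

Inside the ContinueWith lambda, reading t.Result does not block, because the continuation runs only after the task has completed.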
    // Regular expression for extracting page title
    Regex regTitle = new Regex(@"\<title\>([^\<]+)\</title\>");

    // Create a buffer and stream to copy data to
    var buffer = new byte[1024];
    var temp = new MemoryStream();
    int count;
    do {
      // Asynchronously read next 1kB of data
      count = await stream.ReadAsync(buffer, 0, buffer.Length);
      // Write data to memory stream
      temp.Write(buffer, 0, count);
    } while (count > 0);

    // Read data as string and find page title
    temp.Seek(0, SeekOrigin.Begin);
    var html = new StreamReader(temp).ReadToEnd();
    var title = regTitle.Match(html).Groups[1].Value;
    return Tuple.Create(title, html.Length);

Again, this looks like completely normal C# code with a loop inside. The interesting thing is that we use await inside the loop to implement asynchronous waiting. We start by initializing the buffer and a memory stream where we copy the downloaded data. Inside the loop, we start reading from the HTTP stream into a 1kB buffer. This may take some time, so we use the asynchronous version of the operation, ReadAsync, which returns Task<int>.
When the reading completes, the system calls a callback that we provided and the callback runs a continuation generated by the C# compiler. The continuation runs the rest of the loop body, checks the condition and then jumps either to the start of the loop (again) or to the code after the loop.
Let's now look at how we can run the method for a couple of URLs in parallel and collect all the results. The C# language doesn't provide any direct language support for doing that, but we can easily do that using a library. The library that is distributed with the C# preview contains a class TaskEx (I guess it would be merged with the standard Task class in a new .NET). The class has a very useful method named WhenAll:
    async static Task ComparePages() {
      var urls = new[] {
        "http://www.google.com",
        "http://www.bing.com",
        "http://www.yahoo.com",
      };
      var downloads = urls.Select(DownloadPage);
      var results = await TaskEx.WhenAll(downloads);
      foreach (var item in results)
        Console.WriteLine("{0} (length {1})", item.Item1, item.Item2);
    }

    static void Main() {
      ComparePages().Wait();
    }

The WhenAll method (corresponding to Async.Parallel in F#) takes a collection of Task<T> objects and creates a single task that waits until all the tasks given as an argument complete and returns the results in an array. The return type of the method is Task<T[]>. In the code snippet above, we just create an array of URLs and use the Select method to call DownloadPage for every URL. The result is a sequence of tasks that download the specified web pages.
Next, we use the await keyword again to wait until the aggregating task completes (meaning that all downloads have completed). As a result, we get an array that contains the tuples returned by DownloadPage, so we can simply iterate over the values in the array and print the title of each page together with its length (the number of characters in the downloaded HTML).
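The preview's TaskEx.WhenAll later shipped as Task.WhenAll in .NET 4.5, so the same pattern can be tried without the preview library. The following sketch uses that method with a hypothetical FakeDownload function (a delay instead of a real HTTP request) and top-level statements from a recent C# version. It shows that the results array follows the order of the input tasks, even when the tasks complete in a different order.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for DownloadPage: waits, then returns a fake title and length
async Task<Tuple<string, int>> FakeDownload(string name, int delayMs)
{
    await Task.Delay(delayMs);
    return Tuple.Create(name, name.Length);
}

// The first task finishes last and the last one finishes first
var downloads = new[] { "slow", "medium", "fast" }
    .Select((name, i) => FakeDownload(name, (3 - i) * 50))
    .ToArray();   // ToArray forces Select to run, so all tasks start here

Tuple<string, int>[] results = await Task.WhenAll(downloads);

// Results come back in the order of the input tasks, not in completion order
foreach (var item in results)
    Console.WriteLine("{0} (length {1})", item.Item1, item.Item2);
```

This prints the "slow" entry first, then "medium", then "fast", which is convenient when the results need to be matched back to the original list of URLs.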
The method ComparePages is again implemented as an asynchronous method (meaning that it is marked with async and returns Task). It doesn't return any result, but it runs in the background and eventually completes and prints the results. Since Main cannot be asynchronous, we cannot call it using await, but we can use the standard Wait method of the Task type to block the program until the operation finishes. It is worth noting that when we call ComparePages, the operation already starts downloading the web pages, so if your application needs to do something else, you can just call the method and discard the returned Task object.
You may be wondering why we need asynchronous programming in the previous example at all. Why can't we just create three threads and run every operation on a separate thread? If we did that, an operation would occupy the thread for the entire time needed to download the page. On the other hand, with asynchronous support, a thread is needed only to run the processing code. An operation called using await is handled by the system and it will resume the code (and start using some thread) only after it completes. This means that when using the asynchronous model, the above code could easily run on just a single thread, even when downloading a larger number of pages such as 50.
Let's now look at the same problem implemented in F#. Asynchronous programming support is not built directly into the F# language. It is implemented using computation expressions, which is a general purpose feature for writing computations with some non-standard aspect. In F#, the same language feature can be used for sequence expressions (similar to iterators and yield return) and asynchronous workflows.
The .NET methods used and the control flow of the F# version are exactly the same. The main difference is that I'm using a recursive function instead of while to implement the looping, but that's just a personal preference. I wanted to show an idiomatic functional solution (instead of an imperative while loop), but you can use the while loop in F# just fine.
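A rough way to see this is to start 50 simulated downloads at once and measure how long they take together. This is only a sketch: FakeDownload is a hypothetical function that uses Task.Delay in place of real network I/O, it uses Task.WhenAll (the name under which TaskEx.WhenAll later shipped) and top-level statements from a recent C# version, and the measured time will vary from machine to machine.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for a download: no thread is held while the delay is pending
async Task<int> FakeDownload(int id)
{
    await Task.Delay(200);
    return id;
}

var watch = Stopwatch.StartNew();
int[] ids = await Task.WhenAll(Enumerable.Range(0, 50).Select(id => FakeDownload(id)));
watch.Stop();

// All 50 waits overlap, so the elapsed time should be close to one delay,
// not 50 delays (exact timing depends on the machine)
Console.WriteLine("{0} operations in {1} ms", ids.Length, watch.ElapsedMilliseconds);
```

With one thread per operation, the same experiment would need 50 threads that do nothing but wait.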
    let regTitle = new Regex(@"\<title\>([^\<]+)\</title\>")

    /// Asynchronously downloads a web page and returns
    /// title of the page and size in bytes
    let downloadPage(url:string) = async {
      let request = HttpWebRequest.Create(url)
      // Asynchronously get response and dispose it when we're done
      use! response = request.AsyncGetResponse()
      use stream = response.GetResponseStream()
      let temp = new MemoryStream()
      let buffer = Array.zeroCreate 4096

      // Loop that downloads page into a buffer (could use 'while'
      // but recursion is more typical for functional language)
      let rec download() = async {
        let! count = stream.AsyncRead(buffer, 0, buffer.Length)
        do! temp.AsyncWrite(buffer, 0, count)
        if count > 0 then return! download() }

      // Start the download asynchronously and handle results
      do! download()
      temp.Seek(0L, SeekOrigin.Begin) |> ignore
      let html = (new StreamReader(temp)).ReadToEnd()
      return regTitle.Match(html).Groups.[1].Value, html.Length }
It is not difficult to see how this corresponds to the C# version we've seen before. The body of the function is wrapped in an async { ... } block. Unlike in C#, this is not a hard-coded language feature - async is not a keyword but a value defined in the F# library that specifies what is special about the code block (in our case, the fact that it is asynchronous). The block constructs a value of type Async<string * int>, which represents an asynchronous computation that can be started and eventually returns a tuple.
The keywords with a bang (!) correspond to various uses of the await keyword in C#. We're using the use! keyword to start an asynchronous operation and dispose of the returned object when the workflow completes, we're using let! to start an operation and bind the result to a value and, finally, we're using do! to start an operation that doesn't return any result. I'll discuss some of the differences between F# and C# later (and in more detail in an upcoming article). First, let's look at how to start the computation...
Just like in the C# version, we'll first write a function comparePages that downloads multiple pages in parallel and then prints all the results and returns. We'll write it as an asynchronous workflow representing the composed computation. Note that unlike in C#, the computation is not started yet - we're just building a specification of what should be done. Once we have the workflow, we can start it. In the following example we start it and wait until it completes:
    /// Downloads pages in parallel and prints all results
    let comparePages = async {
      let! results =
        [| "http://www.google.com"; "http://www.bing.com"
           "http://www.yahoo.com" |]
        |> Array.map downloadPage
        |> Async.Parallel
      for title, length in results do
        Console.WriteLine("{0} (length {1})", title, length) }

    // Start the computation on current thread
    do comparePages |> Async.RunSynchronously
The code is again very similar to the C# version earlier in the article. We create an array of URLs, use Array.map to turn it into an array of asynchronous computations and then compose the computations into a single one using Async.Parallel. This creates a composed computation that returns an array with all the results. To get the results we wait until the composed computation completes using let!. Once that happens, we iterate over the results and output them to the screen.
Unlike in C# (where an asynchronous method returns a started task), the F# version just gives us a specification of the work to be done. We can start it using various different methods. The method used in the example above is Async.RunSynchronously, which starts the workflow and blocks until it completes. However, the sub-tasks done by the workflow (3 distinct downloads) will all be started in parallel.
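The difference between a started task and a specification can be imitated in C# by wrapping the call in a delegate. The sketch below is only an analogy (a Func<Task<int>> is not an F# workflow and has none of its composition features); Work and log are hypothetical names, and the code assumes top-level statements from a recent C# version.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var log = new List<string>();

async Task<int> Work(string name)
{
    log.Add(name + " started");   // runs synchronously, as soon as Work is called
    await Task.Delay(10);
    return name.Length;
}

// "Hot": calling the async method starts the work immediately
Task<int> hot = Work("hot");

// "Cold": wrapping the call in a delegate only describes the work,
// loosely like an F# Async<int> value
Func<Task<int>> cold = () => Work("cold");

log.Add("nothing cold has run yet");
int a = await hot;
int b = await cold();   // the cold computation starts only here

// prints "hot started, nothing cold has run yet, cold started"
Console.WriteLine(string.Join(", ", log));
Console.WriteLine("{0} {1}", a, b);   // prints "3 4"
```

The delegate can be invoked several times (or never), which is close in spirit to how an F# workflow can be started in different ways or not at all.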
Now that we've seen the same problem implemented in both F# and C#, let's look at some of the similarities and differences between the two programming models. I'll write a separate article that goes into more detail and discusses some of the subtle and tricky differences, but I'd like to mention at least a few of them in this introduction.
Let's start with the similarities. Obviously, the syntax is quite similar. The await keyword corresponds to several keywords from F# computation expressions depending on how it is used. The most common is let!, followed by use! and do!. These keywords are used to insert a special asynchronous call into code that is otherwise sequential.
There are also some very similar concepts in the libraries, in particular, for introducing parallelism. There are two ways of adding parallelism using the asynchronous programming model. One is to compose multiple operations returning the same type (and running the same code) and the other one is to start and wait for tasks explicitly. For the F# way of doing this, see section 4.1 of the article The F# Asynchronous Programming Model [1].
* Composing multiple operations: in C#, this is done with the TaskEx.WhenAll method. This method corresponds to Async.Parallel in F#. These two methods create a task or a workflow that runs multiple operations, waits until all of them complete and then returns all results.
* Starting and waiting for tasks explicitly: in C#, this is done by creating a task (Task.Create), which is started automatically, and then waiting using await. In F#, the same thing is done using Async.StartChild (to start a workflow) and let! to wait for the completion.

Even though the programming model looks almost the same, there are some notable differences. I'll write a more detailed article about some of them, but here is an assorted list of those that I found quite important:
* The F# programming model works with Async<T>, which is just a specification of what should be done. We can start it in various different ways (or we may not start it) and it can be quite nicely composed. The C# programming model works with the Task<T> type, which means that an async method creates a computation that is already running in the background. The F# programming model definitely feels more suitable for functional (declarative) programming languages. I also think that it makes it easier to reason about what is going on.
* In C#, you need to pass a CancellationToken object around and check whether a cancellation was requested. In F#, this is done automatically (checks are made each time some asynchronous operation is executed or when a language construct with special meaning is executed). Implementing the cancellation explicitly isn't as bad as writing everything using BeginFoo, but doing it properly is not as easy as it may sound (more about that later).
* In C#, you can use await anywhere inside an expression, while in F#, let! cannot be nested. When you want to rewrite something like await Foo(await Bar()) in F#, you first have to assign the result of Bar to some local variable. This is a good thing for C#, because it makes writing of certain things easier. A slight disadvantage may be that it is not as easy to see when an asynchronous operation occurs in the expression (though it would be interesting to see this feature in F# too).

As far as I can tell, the asynchronous programming model in C# is largely based on F# asynchronous workflows. It is wonderful to see how ideas from the F# language spread to the mainstream (although it would be nicer if this was more visibly acknowledged in the presentation).
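To illustrate the point about cancellation from the list above, here is a minimal sketch of what passing the token around looks like in C#. CountAsync is a hypothetical worker, and the code assumes the Task API as it shipped in .NET 4.5 (CancelAfter, Task.Delay with a token) together with top-level statements from a recent C# version.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical worker: the token has to be threaded through by hand
async Task CountAsync(CancellationToken token)
{
    while (true)
    {
        token.ThrowIfCancellationRequested();   // explicit check
        await Task.Delay(10, token);            // pass the token on to inner operations
    }
}

var cts = new CancellationTokenSource();
Task counting = CountAsync(cts.Token);
cts.CancelAfter(100);   // request cancellation after roughly 100 ms

bool cancelled = false;
try { await counting; }
catch (OperationCanceledException) { cancelled = true; }

Console.WriteLine("Cancelled: " + cancelled);   // prints "Cancelled: True"
```

Every method on the call path needs a token parameter and has to forward it, which is exactly the bookkeeping that an F# workflow does implicitly.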
This definitely doesn't mean that F# is not interesting anymore. There are many useful features in F# not related to asynchronous programming that would be difficult to imagine in C#. Even when talking just about asynchronous programming, F# still has some features and libraries that make it more attractive. Two examples are implicit cancellation and a wonderful library for agent-based parallelism. Moreover, there are interesting new things in F# coming out at PDC 2010. There are also interesting research projects (like my pattern matching extension [3]) which would make F# even better for reactive, concurrent and asynchronous programming (but, just to be clear, that's a distant future).
Published: Friday, 29 October 2010, 4:34 AM
Author: Tomas Petricek
Tags: c#, asynchronous, f#