yaws

  1. how it works?
  2. query parameters
  3. cookie sessions
  4. record #arg
  5. appmods
  6. appmods for sub-paths
  7. mnesia
  8. json
  9. redirect
  10. cgi, fcgi

how it works?

you enclose Erlang code between <erl>...</erl> tags in your .yaws html pages

when the client GETs a page that has a .yaws suffix, the yaws server will read that page from the hard disk and divide it in parts that consist of HTML code and ERLANG code

each chunk of ERLANG code will be compiled into a module. the chunk must contain a function out/1. if it doesn’t the yaws server will insert a proper error message into the generated HTML output

yaws server will invoke the out/1 function in that code and insert the output of that out/1 function into the stream of HTML that is being shipped to the client

there are several ways to make the out/1 function generate HTML output

the first is by returning a tuple {html, String} where String then is regular HTML data (possibly as a deep list of strings and/or binaries) which will simply be inserted into the output stream

an example:


  <html>
  <body>
  <erl>
    out(Args) ->
      Headers = Args#arg.headers,
      {html, f("You say that you're running ~p", [Headers#headers.user_agent])}.
  </erl>
  </body>
  </html>

but returning HTML-formatted strings from out/1 is painful - embedded tags can get messy. yaws provides ehtml (essentially HTML in Erlang syntax) as a better alternative. it generates output by returning a tuple {ehtml, EHTML}

the term EHTML must adhere to the following structure:


    EHTML =   [ EHTML ]
            | { TAG, Attrs, Body }
            | { TAG, Attrs }
            | { TAG }
            | binary()
            | character()
      where
        TAG = atom()
        Attrs = [{atom(), Value}]
        Value = string() | atom()
        Body = EHTML

so, HTML encoded as Erlang terms

  <tag attr1="val1" attr2="val2">
    child
  </tag>

is represented in ehtml as the Erlang term:

  {tag, [{attr1, "val1"}, {attr2, "val2"}], "child"}

an example:

  <html>
  <body>
  <erl>
    out(Args) ->
      S = Args#arg.headers,
      {ehtml, [{hr}, {p, [], S#headers.user_agent}, {hr}]}.
  </erl>
  </body>
  </html>

you will hit a problem when using div elements, though:

  {ehtml, {div, [], "hello"}}.
  ** 1: syntax error before: 'div' **

this is because div is an infix operator in Erlang.

similar problems will take place if you try to use namespace prefixed elements and attributes

the solution for both cases is to place these problematic symbols in single quotes:


   {ehtml, {'div', [], "hello"}}.
   {ehtml, {'b:form', [{'b:destination', "URL"}], "Form Stuff"}}.
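
To make the EHTML grammar above concrete, here is a minimal sketch of how such a term could be flattened into HTML text. The module name ehtml_sketch and its functions are hypothetical and only illustrate the structure; yaws has its own expander, and details such as how empty tags are written or whether text is escaped may differ.

```erlang
-module(ehtml_sketch).
-export([render/1]).

%% Hypothetical helper, not part of yaws: turns an EHTML term into a
%% deep list of strings. No HTML escaping is attempted.
render(Bin) when is_binary(Bin) -> Bin;                        % binary()
render(Char) when is_integer(Char) -> Char;                    % character()
render(List) when is_list(List) -> [render(E) || E <- List];   % [EHTML], incl. strings
render({Tag}) -> render({Tag, []});                            % {TAG}
render({Tag, Attrs}) ->                                        % {TAG, Attrs}
    ["<", atom_to_list(Tag), attrs(Attrs), " />"];
render({Tag, Attrs, Body}) ->                                  % {TAG, Attrs, Body}
    Name = atom_to_list(Tag),
    ["<", Name, attrs(Attrs), ">", render(Body), "</", Name, ">"].

%% Attrs = [{atom(), Value}], Value = string() | atom()
attrs(Attrs) ->
    [[" ", atom_to_list(K), "=\"", value(V), "\""] || {K, V} <- Attrs].

value(V) when is_atom(V) -> atom_to_list(V);
value(V) -> V.
```

For instance, lists:flatten(ehtml_sketch:render({'div', [], "hello"})) gives the string for a div element containing hello, which also shows why the quoted atom 'div' is just an ordinary tag name once it parses.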

a third way to produce a response is to return one of the following tuples:
{content, MimeType, Content}
this makes yaws generate content other than HTML. this return value is only allowed in a .yaws file which has only one <erl> part and no html parts at all

{connection, What}
this sets the connection header. if What is the special value "close", the connection will be closed once the page is delivered to the client

{set_cookie, Cookie}
prepends a "Set-Cookie: " header to the list of previously set "Set-Cookie: " headers

{page, Page}
makes yaws return a different page than the one being requested

{ssi, File, Delimiter, Bindings}
server side include and macro expansion. each occurrence of a string in File that is enclosed in Delimiter is replaced with the corresponding value in Bindings

access to query parameters

a URL can have an optional query part. this part is available as Args#arg.querydata, a field of the record passed to the out/1 function
  
  out(Args) ->
    Res1 = yaws_api:getvar (Args, par1),
    Res2 = yaws_api:queryvar (Args, par2),
    Res3 = yaws_api:parse_query (Args),
    Res4 = yaws_api:parse_post (Args),
    {html, io_lib:format("~p<p>~p<p>~p<p>~p", [Res1, Res2, Res3, Res4])}.
  

functions:

yaws_api:queryvar(Args, Key) -> {ok, Value} | undefined
returns {ok, Value} if the variable is found in the query part of the request. if the variable is not found or is unset, the function returns undefined

yaws_api:getvar(Args, Key) -> {ok, Value} | undefined
looks at the request method: for a GET request it looks the variable up in the query part of the URL, for a POST request in the posted data

yaws_api:parse_query(Args) -> [{Key, Value}] 
parses the raw query data into a key/value list. it is a fairly convenient way of getting data from GET requests

  $> curl -X GET localhost:8080/index.yaws?"par1=var1&par2=var2&par3=var3"
   {ok,"var1"}<p>{ok,"var2"}<p>[{"par1","var1"},{"par2","var2"},{"par3","var3"}]<p>[]

yaws_api:parse_post(Args) -> [{Key, Value}] 
parses the raw posted data into a key/value list. it is a fairly convenient way of getting data from POST requests

  $> curl -X POST localhost:8080/index.yaws \
     -d 'par1=var1&par2=var2&par3=var3'
  {ok,"var1"}<p>undefined<p>[]<p>[{"par1","var1"},
                                  {"par2","var2"},
                                  {"par3","var3"}]
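
As a rough illustration of the key/value list shown above, the following sketch parses a raw query string with the standard uri_string module (assumed available, OTP 21 or later) and looks up one key. The module query_sketch and its functions are hypothetical stand-ins, not the yaws_api implementation; unlike the example above, keys must be given as strings here.

```erlang
-module(query_sketch).
-export([parse/1, var/2]).

%% Hypothetical stand-in for the idea behind parse_query/1:
%% "k1=v1&k2=v2" -> [{"k1","v1"},{"k2","v2"}]
parse(undefined) -> [];                      % no query part in the URL
parse(QueryString) ->
    case uri_string:dissect_query(QueryString) of
        Pairs when is_list(Pairs) -> Pairs;
        _Error -> []                         % malformed query: treat as empty
    end.

%% Hypothetical stand-in for the idea behind queryvar/2.
var(QueryString, Key) ->
    case lists:keyfind(Key, 1, parse(QueryString)) of
        {Key, Value} -> {ok, Value};
        false -> undefined
    end.
```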

cookie sessions

yaws_api:new_cookie_session (Data)     -> String
yaws_api:cookieval_to_opaque (Cookie)  -> {ok, Data} | {error,no_session}

yaws_api:replace_cookie_session (OldCookie, NewData) -> true
yaws_api:delete_cookie_session (Cookie) -> nocleanup
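
How the four calls fit together can be modelled with a plain ETS table. This is only a sketch under stated assumptions: the module session_sketch and all of its functions are hypothetical, the return values merely mirror the signatures listed above, and the real yaws session server works differently (for one thing, it generates unpredictable cookie values and expires idle sessions).

```erlang
-module(session_sketch).
-export([init/0, new/1, lookup/1, replace/2, delete/1]).

%% Create the table that maps cookie strings to opaque session data.
init() ->
    ets:new(?MODULE, [set, public, named_table]),
    ok.

%% Like new_cookie_session/1: store Data, hand back a cookie string.
%% unique_integer/1 is predictable, so this is NOT safe for real sessions.
new(Data) ->
    Cookie = integer_to_list(erlang:unique_integer([positive])),
    true = ets:insert(?MODULE, {Cookie, Data}),
    Cookie.

%% Like cookieval_to_opaque/1.
lookup(Cookie) ->
    case ets:lookup(?MODULE, Cookie) of
        [{_, Data}] -> {ok, Data};
        [] -> {error, no_session}
    end.

%% Like replace_cookie_session/2: overwrite the stored data.
replace(Cookie, NewData) ->
    ets:insert(?MODULE, {Cookie, NewData}).

%% Like delete_cookie_session/1: forget the session.
delete(Cookie) ->
    ets:delete(?MODULE, Cookie).
```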

example:

  <erl>
  -record(myrec, {name = "", time = 0}).

  out(Arg) ->
    X = (Arg#arg.headers)#headers.cookie,
    case yaws_api:find_cookie_val("sid", X) of
      [] ->
        Cookie = yaws_api:new_cookie_session(#myrec{name="Foo"}),
        CO = yaws_api:setcookie("sid", Cookie, "/"),
        [{ehtml,[{p,[],"set cookie to "},f("~p",[Cookie])]}, CO];
      Cookie ->
        {ok, Data} = yaws_api:cookieval_to_opaque(Cookie),
        NewTime = Data#myrec.time + 1,
        NewData = Data#myrec{time = NewTime},
        yaws_api:replace_cookie_session(Cookie, NewData),
        [{html, f("~p<p>~p",[Cookie, NewData])}]
    end.
  </erl>

instead of your own record you can use the predefined record:
 
  -record(setcookie, {
     key,
     value,
     quoted,
     comment,
     comment_url,
     discard,
     domain,
     max_age,
     expires,
     path,
     port,
     secure,
     version
   }).
 

record #arg

the key yaws data structure is the #arg record. it contains everything yaws knows about a request:
  • client socket
  • HTTP headers
  • request info
  • target URL details

this record is defined in the yaws_api.hrl file, which is included by default in all .yaws files

yaws passes the #arg record as the single parameter to out/1

    
      -record(arg,
      {
        headers,        % record
        clisock,        % the socket leading to the peer client
        client_ip_port, % {ClientIp, ClientPort}
        req,            % {http_request, Method, {abs_path,Path}, {1,1}}
        clidata,        % binary in POST requests, undefined in GET requests
        querydata,      % string "par1=val1&par2=val2..." (GET reqs)
        appmoddata,     % the remainder of the path up to the query
        appmod_prepath, % path in front of: <appmod><appmoddata>
        pathinfo,       % the remainder of the path after .yaws
        server_path,    % path after domain:port/
        fullpath,       % full path to .yaws file
        docroot,
        cont,
        state,
        pid,
        opaque
      }).
    
    
the headers field of this record is itself a record:
    
    
      -record(headers,
      {
        connection,
        accept,
        host,
        if_modified_since,
        if_match,
        if_none_match,
        if_range,
        if_unmodified_since,
        range,
        referer,
        user_agent,
        accept_ranges,
        cookie,
        keep_alive,
        content_length,
        authorization,
        other
      }).
    
    
    
so all of this data is available to your application module
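
As a small illustration of reading one of these fields, the sketch below takes the value of Args#arg.req and extracts the method and path. It assumes the tuple shape given in the record comment above; the module req_sketch and its function are hypothetical names used only for this example.

```erlang
-module(req_sketch).
-export([method_and_path/1]).

%% Hypothetical helper: Req is the value of Args#arg.req, assumed to look
%% like {http_request, Method, {abs_path, Path}, Version}.
method_and_path({http_request, Method, {abs_path, Path}, _Version}) ->
    {Method, Path};
method_and_path(_Other) ->
    unknown.   % any other request shape is not handled by this sketch
```

Inside out/1 this would be called as req_sketch:method_and_path(Args#arg.req).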

appmods

an appmod (“application module”) is an Erlang module that:
  • starts with the http server
  • exports at least a start/0 (initializer) function
  • exports other functions that may be called from any .yaws out/1 function at any time

such modules are listed in the yaws.conf file with the runmod directive:

     
       #path to appmod beam files
       ebin_dir = /store/www/yaws/ebin
    
       # appmod beam files
       runmod = appmod_name_1
       runmod = appmod_name_2
       ...
     
so an appmod is a kind of library for .yaws pages

example:

      -module(myauth).
      -export([loop/0, check/1, start/0, into/2]).
      -include("/usr/local/lib/yaws/include/yaws_api.hrl").
    
      start() ->
        spawn(?MODULE, loop, []),
        ok.
    
      loop() ->
        ets:new(mytable, [set, public, named_table]),
        receive
          _ -> ok
        end.
    
      check(Args) ->
        C = yaws_api:find_cookie_val("cookieVal",(Args#arg.headers)#headers.cookie),
        R = ets:lookup(mytable, C),
        case R of
          [] -> 0;
          [{_Cookie, Data}] -> Data
        end.
    
      into(Cookie, Data) ->
        ets:insert(mytable, {Cookie, Data}).
    

and then:

      <erl>
        out(_Args) ->
          % now/0 and random are the R15-era calls; on OTP 18+ use rand:uniform/1
          {A, B, C} = now(),
          random:seed(A, B, C),
          D = integer_to_list(random:uniform(10000)),
          E = yaws_api:setcookie("cookieVal", D),
          myauth:into(D, 10),                          % call into the appmod
          [{html, "ok"}, E].
      </erl>
    
      <erl>
        out(Args) ->
          case myauth:check(Args) of                   % call into the appmod
            0 -> {redirect_local, "/index.html"};      % check/1 returns 0 for an unknown cookie
            _ -> {html, "ok"}
          end.
      </erl>
    

appmods for sub-paths

appmods may be configured for URL sub-paths. in the yaws.conf file:
     
      runmod = appmod_name
      
      <server localhost>
        port = 8080
        listen = 0.0.0.0
        docroot = /usr/share/yaws
        appmods = <substr_in_uri, appmod_name>
      </server>
     

when a request arrives for a matching URL, yaws dispatches the request to the appmod callback function out/1 to process the rest of the URI, passing it an arg record. the appmod's out/1 function can then examine the rest of the URI to determine the precise resource that is the target of the incoming request, and respond accordingly
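
The "examine the rest of the URI" step could look like the sketch below: it splits the remaining path into segments and picks an action from the method and the segments. Everything here is hypothetical (the module route_sketch, the action names, the URL layout); in a real appmod, Method would come from Args#arg.req and the remaining path from Args#arg.appmoddata. string:lexemes/2 is assumed available (OTP 20 or later).

```erlang
-module(route_sketch).
-export([route/2]).

%% Hypothetical dispatcher: Method is an atom such as 'GET',
%% AppmodData is the part of the path after the appmod prefix.
route(Method, AppmodData) ->
    case {Method, string:lexemes(AppmodData, "/")} of
        {'GET',    []}   -> list_items;
        {'GET',    [Id]} -> {show_item, Id};
        {'POST',   []}   -> create_item;
        {'DELETE', [Id]} -> {delete_item, Id};
        _                -> not_found
    end.
```

An out/1 function would then turn the returned action into an {ehtml, ...}, {content, ...} or redirect result.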

a special case of an appmod that is particularly interesting is the '/' appmod. this is used when we want application code to handle all requests

     
      <server myframe>
        port = 8001
        listen = 0.0.0.0
        docroot = /home/yaws/www
        appmods = </, appmod_name>
      </server>
     

one complication with the slash appmod is that usually we have a set of folders containing static data, images and JavaScript, and we want yaws to just deliver those files without intervention from the slash appmod. this can be achieved by excluding a set of directories:

     
      <server myframe>
        port = 8001
        listen = 0.0.0.0
        docroot = /home/yaws/www
        appmods = </, appmod_name exclude_paths pics js top/static>
      </server>
     

this configuration will invoke the appmod_name Erlang module on everything except files found in the directories pics, js and top/static relative to the docroot


starting mnesia

      $> sudo yaws --daemon -name yaws@zog --mnesiadir "/store/www/yaws/mnesia"
    
another way - in interactive mode:
      $> sudo yaws -i -name yaws@zog
      Erlang R15B01 (erts-5.9.1) [source] [async-threads:0] [hipe] [kernel-poll:false]
      1> application:set_env(mnesia, dir, "/store/www/yaws/mnesia").
      ok  
      2> application:start(mnesia).
      ok  
    
or, simpler:
      $> sudo yaws -i -name yaws@zog --mnesiadir "/store/www/yaws/mnesia"
    
or, you can connect to the yaws (erl) runtime and start Mnesia from there. if you started yaws with -name and -setcookie you can connect to it like any other Erlang runtime

or, you could add this line to the yaws.conf file:

      mnesia_dir = /var/yaws/www/mnesia
    

module json2

this module translates JSON types into the following Erlang types:
        JSON             Erlang
        -----------------------------------------------------
        number           number
        string           string
        array            {array, ElementList}
        object           {struct, Proplist with string keys}
        true             atom true
        false            atom false
        null             atom null
        ------------------------------------------------------
    
    JSON   :   {"sKey":"Val", "iKey":10, "aKey":[1, 2, 3]}
    Erlang :   {struct,[{"sKey","Val"},{"iKey",10},{"aKey",{array,[1,2,3]}}]}
    
nested structures need to have the same "struct" style as the top-level object:

      {struct, [ {hello,"foo"}, {from,1}, {to, {a,"bar"}} ]}

should instead be:

      {struct, [ {hello,"foo"}, {from,1}, {to, {struct, [ {a, "bar"} ]}} ]}
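
One way to avoid forgetting the inner struct wrapper is to build the term mechanically. The sketch below converts nested Erlang maps into the struct style shown above; the module struct_sketch and its function are hypothetical and not part of json2, maps are assumed available (OTP 17 or later), and arrays are expected to be written as {array, List} already.

```erlang
-module(struct_sketch).
-export([to_struct/1]).

%% Hypothetical helper: every map, at any depth, becomes {struct, Proplist}.
to_struct(Map) when is_map(Map) ->
    Pairs = lists:sort(maps:to_list(Map)),          % stable key order
    {struct, [{K, to_struct(V)} || {K, V} <- Pairs]};
to_struct({array, Items}) ->
    {array, [to_struct(I) || I <- Items]};          % convert array elements too
to_struct(Other) ->
    Other.                                          % numbers, strings, atoms
```

Calling struct_sketch:to_struct(#{hello => "foo", from => 1, to => #{a => "bar"}}) yields a term in which the value of to is itself wrapped in struct, ready to be passed to the encoder.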
     
note: Erlang's floats are of fixed precision and limited range, so syntactically valid JSON floating-point numbers could silently lose precision or noisily cause an overflow

encoding: json2:encode(Term)
      out(Args) ->
        ...
        Obj = {struct, [
          {"item", "Item1"},
          {"price", 20.2},
          {"contract", 51},
          {"qty", 3}
        ]},
        % {array, List} needs a list, so the single object is wrapped
        {html, json2:encode({array, [Obj]})}.
    
parsing: json2:decode_string(JSON_String)
      out(Args) ->
        [{DataString,_}] = yaws_api:parse_post(Args),
        {ok, {struct,DataObj}} = json2:decode_string(DataString),
        ...
    
object keys may be atoms or strings on encoding but are always decoded as strings

redirect

redirect in config

the section
      <redirect>
        ...
      </redirect>
    
in the yaws.conf file defines a redirect mapping. the following items are allowed:
      Path = URL

or
      Path = file

all accesses to Path will be redirected to URL/Path, or alternatively to scheme://host:port/file/Path if a file is used. note that the original path is appended to the redirected URL. so, assuming this config resides on a site called http://abc.com, if we have for example:
      <redirect>
        /foo = http://www.mysite.org/zapp
        /bar = /tomato.html
      </redirect>
    
we have the following redirects:
      http://abc.com/foo       -> http://www.mysite.org/zapp/foo
      http://abc.com/foo/test  -> http://www.mysite.org/zapp/foo/test
      http://abc.com/bar       -> http://abc.com/tomato.html/bar
    
when we specify a file as the target for the redirect, the redirect will be to the current http server

sometimes we do not want to have the original path appended to the redirected path. to get that behaviour we specify the config with '==' instead of '=':

      <redirect>
        /foo == http://www.mysite.org/zapp
      </redirect>
    
now a request for http://abc.com/foo/test simply gets redirected to http://www.mysite.org/zapp. this is typically used when we simply want a static redirect at some place in the docroot
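
The difference between '=' and '==' can be written down as a tiny function. This is a sketch of the rule as described above for URL targets, not yaws code; the module redirect_sketch, its function and the atoms append and exact are hypothetical names.

```erlang
-module(redirect_sketch).
-export([target/3]).

%% Mode is append for "Path = URL" and exact for "Path == URL".
%% RequestPath is the path that was requested, e.g. "/foo/test".
target(append, Url, RequestPath) ->
    Url ++ RequestPath;          % original path is appended
target(exact, Url, _RequestPath) ->
    Url.                         % original path is dropped
```

With the /foo mapping above, target(append, "http://www.mysite.org/zapp", "/foo/test") corresponds to the '=' form and target(exact, "http://www.mysite.org/zapp", "/foo/test") to the '==' form.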

redirect in script

      <!--  redirect.yaws  file -->
      <erl>
        out(_Arg) ->
          {redirect, "http://www.google.com"}.
      </erl>
    
the code above redirects to an external URL. the HTTP RFC mandates that the Location header must contain a complete URL, including the scheme (http, https, etc.)

a very common case of redirection is to redirect to another file on the same server.

      <!--  redirect2.yaws  file -->
      <erl>
        out(_Arg) ->
          {redirect_local, "/redirect.yaws"}.
      </erl>
    
this code will do a relative redirect to the code in redirect.yaws which in its turn redirects, once again, to google

cgi, fcgi

.cgi files are ordinary scripts whose first (shebang) line gives the interpreter path

with perl:

    #!/usr/bin/perl
    
    use CGI;
    
    $q = new CGI; 
    print $q->header,                                      # create the HTTP header
          $q->start_html('hello perl'),                    # start the HTML
          $q->h1('hello world from yaws to perl'),         # level 1 header
          $q->end_html;     
    
    exit 0;
    

with python:

    #!/usr/bin/python2
    
    import cgi
    
    print "content-type: text/html\n\n";
    print "hello from yaws to python!<p>"
    print "<br><br>"
    

you can use Erlang as well, but in that case without the shebang line:

    <html>
    <h1> hello world !</h1>
    
    <erl>
      out(Arg) -> {ehtml, [{h2, [{class, "greeting"}],  "hello from yaws to yaws"}]}.
    </erl>
    
    </html>
    

cgi in config

    <server localhost>
            port = 80
            listen = 0.0.0.0
            docroot = /home/www
            allowed_scripts = yaws cgi 
    </server>
    

fcgi source

    #include "fcgi_stdio.h"  // fcgi library; put it first  
    #include <stdio.h>
    
    int main (int argc, char *argv[]) {
      int count = 0;
    
      while (FCGI_Accept() >= 0) {   // FCGI_Accept returns 0 on success, negative on failure
        printf ("content-type:text/html\n\nFastCGI<p>your ticket is %i\n", count++);
      }
    }
    
    /*     gcc hello_fcgi.c -lfcgi -o a.out
     *     sudo spawn-fcgi a.out -a 127.0.0.1 -p 23456        */
    

fcgi in config

note: for fcgi scripts, the FastCGI application server is only called if a local file with the .fcgi extension exists. however, the contents of the local .fcgi file are ignored

    <server localhost>
            port = 80
            listen = 0.0.0.0
            docroot = /home/www
            allowed_scripts = yaws fcgi
            fcgi_app_server = localhost:23456
    </server>
    
put an empty file a.fcgi in the docroot directory and type in the browser address bar:
      http://your.server.addr/a.fcgi

or in a shell:

    $> curl localhost/a.fcgi
    FastCGI<p>your ticket is 0
    $> curl localhost/a.fcgi
    FastCGI<p>your ticket is 1
    $> curl localhost/a.fcgi
    FastCGI<p>your ticket is 2
    $> curl localhost/a.fcgi
    FastCGI<p>your ticket is 3
    $> curl localhost/a.fcgi
    FastCGI<p>your ticket is 4
    $> curl localhost/a.fcgi
    FastCGI<p>your ticket is 5
    $> curl localhost/a.fcgi
    FastCGI<p>your ticket is 6
    $> curl localhost/a.fcgi
    FastCGI<p>your ticket is 7
    $>