Tagged: Software Development Toggle Comment Threads | Keyboard Shortcuts

  • kmitov 6:17 pm on March 31, 2020 Permalink |
    Tags: algorithms, , Software Development, tree   

    Everyday Trees in Computer Science – [Everyday Code] 

    The goal of this article was to equip developers with everything they need to know about trees to use Instruction Steps Framework (IS) and Instruction Converter (IC) that we are currently working on in Axlessoft. While developing the article for internal use I realized that these are things that every developer should have under their sleeve to successfully use tree structures in their day to day tasks.

    image of a tree that will make the article more interesting. Not a real programming tree

    Some of us might be tempted to revise the heavy Knuth books when it comes to solving the next tree problem. But generally we don’t need to do it. You know the meme:

    • First year Computer Science – “algorithms, data structures, etc …”
    • Second year – “functional programming, object oriented, etc…”
    • Last year – “cryptography, network programming, design of languages, etc
    • First day at work – “move this button to the left”

    So let’s start with a few things you need to know day to day.

    You will see “parent->children” more often than “node->left,right”, eg. binary trees.

    If you have a structure of parent and children this is a tree structure. You have three scenarios:

    1. The parent knows about the children and the children know about the parent. There is a parent.children and child.parent
    2. The parent knows about the children, but the children do not know about the parent. There is parent.children, but there is no child.parent
    3. The parent does not know about the children, but the children know about the parent. There is no parent.children, but there is child.parent

    In many books, algorithms and generally articles you will have people talking about Binary Trees. Now, it might be because of my line of work, but:

    1. I can not remember the last time that I had to properly work with a binary tree – something more than “just find this node”.
    2. For every time I have to work with “binary trees” there are at least 1024 times I have to work with “parent->children” structures.

    Yes, it is the same. But it is also not the same.

    Take away:

    All the three cases represent a tree. It is best to have case 1, but this is not always possible. You will deal with “parent->children” more often than with “left node”, “right node”.

    If there is a library for it, use a library.

    This should be a no-brainer, but a lot of times we like to code a few iterations here and there and develop things for basic iteration over a tree. Just don’t. If there is something tree related that you can not find a library to do for you in about 5 minutes internet search, then you are doing something worth of a Phd or you are searching with the wrong queries (or you are trying to do something stupid!).

    There are of course exceptions. Because when we, as great developers, implement Pre and Post order iteration we are doing it better than anyone else. So this exception is acceptable. Of course.

    Iteration -Pre, Post, In and visiting a node

    Visiting a node means that you are processing it. For example printing to the screen.

    There are a few basic ways. Just search for them on the internet. There are a lot of really nice resources of how you iterate over tress. The basic ways are:

    1. LRN – Left Right Node – first you visit the left child, then you visit the right child, then you visit the node. But as we say this is the theory. In a real life scenario with “parent->children” you visit children[0]..children[n] and then the parent
    2. LNR – Left Node Right – How do you do this for “parent->children” I don’t know.
    3. NLR – Node Left Right – first the parent then children[0]..children[n]
    4. NRL – Node Right Left – first the parent then children[n]..children[0]
    5. RLN – Right Left Node – first children[n]..children[0] then the parent.

    Avoid recursion

    Yes, you have trees. But avoid recursion. Use “traversers”. Almost any reasonably useful library has a “traverser” support. The idea is simple – somebody else is doing the recursion for you and this somebody else is the library.

    In pseudo code:

    rootNode = ... 
    rootNode.traverse((node)=> {
       print node

    The idea is simple. You have a rootNode. It has a method ‘traverse’ that will traverse the whole tree, iterate, recurse, do a lot of magic, but at the end it will traverse the whole tree and will call the “print node” for every node in the tree.

    There is no recursion or iteration in your code. There is just traverse. Sometimes it is called “visitor” ( Visitors pattern, Gang of four, things like this). I kind of preferred “visitor” in early stage of my software development career, but now I more often call them “traversers”. Traversers are visitors, but not all visitors are traversers from my point of view. Would be happy to talk about this over a beer. Anyone..?

    Take away:

    If you are using a library that does not have traverse you should seriously consider using another library.

    Avoid ‘getDescendants()’ and getDeepChildren()

    Yes. These are some libraries that try to help. Imagine that you nave:

    node 1
      node 1.1
        node 1.1.1
      node 1.2
        node 1.2.1
        node 1.2.2
      node 1.3

    You need to get all the children of node 1 and all of their children and you need this in an array. Why? Because reasons. This is the feature request. There are libraries like BABYLON.js that try to help. They offer a “getDescendants()’ method. You can call:

    anArray = node.getDescendants();
    // here you have all the elements of the tree in anArray
    // loop over the array
    anArray.forEach((a)=> {
       print a

    This method saves a lot. You do not need a “traverser”. You don’t need to know anything about order. You need all the elements of the whole tree so get an array. But there are drawbacks.

    1. It could be a large tree. So you are creating a copy, but this time in an array and memory and user experience are expensive.
    2. Even if it is not large you are still creating an array. Do you really need an array or you could just traverse the whole tree and do your job without previously creating an array?
    3. If you have to call getDescendants() more then once, you should refactor your code.

    Consider the following:

    anArray = node.getDescendants();
    // here you have all the elements of the tree in anArray
    // loop over the array
    anArray.forEach((a)=> {
       if(a.name.includes("1")) {
          print a.getDescendants().length

    In the example above we would like to print the number of descendants for each node that includes the string “1” in its name. How many times is the tree traversed? How many arrays are created? The answer is “a lot”. You traverse once for the rootNode and then for every other node again and you create new arrays.

    Take away:

    If there is a helper method it is probably giving you, the developer, something, but it is at the expense of the user experience – more memory, slower execution. Consider the implications of using helper methods. They might help you, but are they helping the user using your software.

  • kmitov 6:48 am on August 12, 2019 Permalink |
    Tags: , , Software Development,   

    ‘if-else’ is Forbidden – why and how to reduce logic branching in your code. 

    Each time there is an if-else logic in your code there is some logic branching. A state is checked and based on the result from the check different path are taken. This means that two scenarios are implemented and two scenarios should be support from now on. This could quickly become a problem because the number of supported cases grows exponentially.

    The more decisions you make in the code the more cases you must support.

    At the end you start from Process1 but you have 4 different cases and path that you should support. These are 4 different tests that should be written and these are for different places where things could go wrong.

    The four different paths are Process 1,2,4; 1,2,5; 1,3,6; 1,3,7

    Demonstration of a live example and a real production bug

    We have Subscriptions in the platform. Each subscription has a max_users field that shows how many users you could add to this subscription. If max_users=1 this means you can have only one user using the subscription.

    When adding people to the subscription you have two cases

    1. You successfully add someone to the subscription and you show a message “Success”
    2. The max users is reached and you show a message “Error”

    The code in a simplified manner looks something like this:

    if subscription.save 
       form.show_message "Success"
       form.show_message "Error"

    While developing we’ve changed the code for the error from form.show_message “Error” to modal_dialog.show_message “Error”

    After that we’ve changed the implementation further and the code for modal_dialog.show_message “Error” was no longed called.

    As a result when the use tries to add someone to his subscription, for which they’ve payed, the app freezes and nothing happens. There is no error displayed and no user is added to the subscription.

    The bug occurred because with the latest changes we’ve forgot to manually check the second case of the if-else clause and there is was not test developed for this case.

    How to remove the if-else clause from this code

    message = subscription.get_save_message
    form.show_message message

    The subscription.save knows what message to set based on whether the subscription was successfully saved. It could set Error or it could set Success. After that we just show whatever message the subscription has to the form. If it is empty that’s great. This means no errors have occurred. subscriptions.get_save_message could be implemented in many different ways. It could be on the subscription object or another object, but this depends on the technology and framework used. But after the method save is called the message is set and there is a single flow and now branches in our code. The method form.show_message is called a single time on a single place in our code. If we change the API of this method we would change in a single place and will not forget about the second place. There is always a single scenario. Save->Show message.

  • kmitov 7:06 am on January 22, 2019 Permalink |
    Tags: , , Software Development   

    Implementation of Multi-levels caching. Example for Rails 

    There are two difficult things in Computer Science. Choosing a name for a variable and cache invalidation.

    That being said I went on a journey to implement multi-levels caching in one of our platforms. Cache should be fast. Fast cache is expensive. If you can use 1M of very fast and expensive cache, why not implement a second level cache that is 10M, not that fast and not that expensive and 100M of normal cache that is cheap, but still faster than going to db.


    I decided to implement a DbCache that will store cached html rendered values directly in db and will access them from the DB instead of the fast Redis/Memcachier cache. All in the name of saving a few dollars on expensive fast cache that I do not really need.

    <% cache(cache_key) do # this is the redis cache %>
      <%= db_cache(cache_key) do # this is the db cache %>
        <% # actual calculation of value %>
      <% end %>
    <% end %>


    There is no need to constantly render the html/json of the content that we would like to serve to the client. It could be rendered once and served until updated. We are using Memcachier for a very fast access to cached values, but it is costing us more money. And in many cases we do not need this fast access.

    That’s why there is a DbCache implementation

    It works in the following way. It has a key and a value.

    When using in the view you can use

     <% cache(cache_key) do %>
       <%= db_cache(cache_key) do %>
       <% end %>
     <% end %>

    In this way if something is in cache we take it. This is L1 cache if you like. If it is not in L1 cache (that is stored in memory) than we ask db_cache. db_cache is our second level cache – L2. If the value is not in db_cache then we render it. The principle could be applied for L3 cache, although we are not there yet as a platform to need L3 cache.

    But it is db_cache. It is accessing the db. Why do you even call it cache?

    When the db_cache is accessed we make a single query to the db and retrieve a single record. For an indexed, not very large table this is fast. If the value is to be rendered again it will mean making a few more request for different objects and their associations and moving through all the trouble of rendering it again which involves views. By retrieving the HTML/JSON directly from DB we could directly serve it.

    How is db_cache implemented?

    DbCache model that stores the values in the db. It has a key and value columns. That’s it. Creating/retrieving/updating DbCache records is what is interesting.

    The key is column in the DB that is an integer. NOT a string. This integer is generated with a hash function and is than shifted right. The postgres db has a signed int column and the hash is generating an unsigned int. We have to shift because there is not enough space for storing unsigned int in a postgres db. In this way the cache key given from the developer is transformed to an internal key that is used for finding the record. And there of course is an index on this hash64 column.

    def db_cache key, &block
      # This will give us a 64 bit hash
      # Shift the value to reduce it because the column is signed and there is no room for
      # an unsigned value
      internal_key = Digest::XXH64.hexdigest(key).to_i(16) >> 1
      db_cache = DbCache.find_by(hash64: internal_key)

    How are keys expired?

    If a key has expired we must create a new record for this key. Expiring keys could be difficult. Every record has an updated_at value. Every day a job on the db is run and if the updated_at value is more than specific days old it is automatically expired. This controls the number of records in the DB. I am kind of interested in storing only records that are regularly accessed. I think that if a page was not accessed in a couple of days, you generally do not need a cached value for it.

    This opens the next question:

    How are keys marked not to expire? If we change the accessed_at for a record on every read that will be slow because of a write to accessed_at

    True. It is important to expire old keys, but it is also important not to touch records on every read request because this will be very slow. If we make a touch on every request to the cache this will involve an update that will slow down the method. So an update is happening only once a day. See the db_cache.touch call below. The record could be accessed thousands of times today but there will be only one write to update the updated_at value. To touch the record.

    def db_cache key, &block
      internal_key = Digest::XXH64.hexdigest(key).to_i(16) >> 1
      db_cache = DbCache.find_by(hash64: internal_key)
      if db_cache.nil?
        # create cache value
        db_cache.touch if db_cache.updated_at < Time.now - 1.day

    How fast is DbCache?

    These are just referenced values and of course this depends on the size of your cached values and the db and the load. In our specific case on Heroku we’ve found that the DbCache generally retrieves values in the range of 2 to 8 ms. In comparison the very fast Memcachier does this in the range of 2 to 8 ms.

    We also used NewRelic to look at the performance that end users are experiencing. And there was a large improvement because we could cache hundreds of MB of records in DB compared to the few MB for Memcachier that we are paying for.

    Rails specific details

    Since this code has to live in our platform and it is also bound to use some other rails object there are a few things more that I’ve done. Here is the full code that I hope gives a complete picture.

    # Author::    Kiril Mitov  
    # Copyright:: Copyright (c) 2018 Robopartans Group
    # License::   MIT
    module DbCacheHelper
      def db_cache key, &block
        # puts "db_cache: key: #{key}"
        result = nil
        if controller.respond_to?(:perform_caching) && controller.perform_caching
          # This will give us a 64 bit hash
          # Shift the value to reduce it because the column is signed and there is now room for
          # un unsigned value
          internal_key = Digest::XXH64.hexdigest(key).to_i(16) >> 1
          db_cache = DbCache.find_by(hash64: internal_key)
          if db_cache.nil?
            # puts "DBCache Miss: #{key}, #{internal_key}"
            Rails.logger.info "DBCache Miss: #{key}, #{internal_key}"
            content = capture(&block)
            # Use a rescue. This will make sure that if
            # a race condition occurs between the check for
            # existence of the db_cache and the actuall write
            # we will still be able to find the key.
            # This happens when two or more people access the site at exactly the
            # same time.
              # puts "DBCache: Trying to create"
              # puts "DBCache Count before find or create: #{DbCache.count}"
              db_cache = DbCache.find_or_create_by(hash64: internal_key)
              # puts "DBCache Count after find or create: #{DbCache.count}"
              # puts "DBCache: Found or record is with id:#{db_cache.id}"
            rescue ActiveRecord::RecordNotUnique
            # The update is after the create because the value should not be part of the
            # create.
            db_cache.update(value: content)
            # puts "DBCache Hit: #{key}, #{internal_key}"
            Rails.logger.info "DBCache Hit: #{key}, #{internal_key}"
            db_cache.touch if db_cache.updated_at < Time.now - 1.day
          result = db_cache.value
          result = capture(&block)
        # Result could be nil if we've cached nil. So just dont return nil,
        # but return empty string
        result ? result.html_safe : ""

  • kmitov 2:02 pm on January 9, 2019 Permalink |
    Tags: , parallel, , , Software Development   

    i18n locales and the pass of rspec parallel specs and 

    First rule of good unit testing is: each test should be independent of the other tests.

    But if there is a global variable like I18n.locale than one spec could touch it and another spec will be run in a different locale from the default.


    Before each suite of specs set the locale to the default. This ensures that the specs are run against the same locale each time. Specific code is:

    # spec/rails_helper.rb
    RSpec.configure do |config|
      config.before :each do
        I18n.locale = Globalize.locale = I18n.default_locale

    i18n breaks spec isolation

    Internationalization, or i18n, should be part of most platforms. This means that i18n features should be properly tests. In suites when one of the specs modifies i18n the specs that are run after that are dependent on this new local.

    This seem trivial, but we only go to explore it the moment we started running specs in parallel on different CPU cores. The specs were started and run in different times and order on each run.

  • kmitov 7:21 am on January 8, 2019 Permalink |
    Tags: , chromedriver, feature tests, google-chrome, , , Software Development,   

    Chromedriver not filling all the whole password field in automated RSpec, Capybara, Feature Tests 

    This is such a lovely story. You will like it.

    When using google chromedriver to run capybara tests sometimes, just sometimes, especially if the tests are run in parallel, when the test has to fill a text field like a password, it fills only part of it. Last time checked for chromedriver 2.45

    TL; DR;

    Solution – google does not care, or it seems it is too difficult for the chromedriver team to resolve so there simply is no solution.

    What is the test?

    We are using Rails, Rspec, Capybara, Google chromedriver. We are developing feature tests. Tests are run in parallel with

    rake parallel:spec

    Here is the test in question. Simply fill a password on the form for upgrading a subscription, click on Confirm and expect to be redirected to a page that says – “You’ve upgraded your subscription”

    def submit_and_expect_success password
          # dialog opens to confirm with password
          fill_in "owner_password", with: password
          click_on "Confirm"
          expect_redirect_to "/subscriptions/#{subscription.to_param}"
          # If it redirects to this page, it means that the change was successful
          expect(page).to have_current_path "/subscriptions/#{subscription.to_param}"

    And the tests are failing with timeout at expect_redirect_to. No, expect_redirect_to is a custom method, because we are using ActionCable to wait for subscription upgrade to finish. Because of the payment service at the back this sometimes takes a while and we want to show a nice progress and we need a websocket. But that being said the method is nothing special.

    module ExpectRedirect
      def expect_redirect_to url
        # If the path doesn't change before the timeout passes,
        # the test will fail, because there will be no redirect
        puts "expect url: #{url}"
          Timeout.timeout(Capybara.default_max_wait_time) do
            sleep(0.1) until url == URI(page.current_url).path
        rescue Timeout::Error=>e
          puts "Current url is still: #{page.current_url}"
          puts page.body
          raise e

    If we are redirected to the url withing Capybara.default_max_wait_time than everything is fine. If not, we are raising the Timeout::Error.

    Parallel execution

    For some reason the test in question fails only when we are doing a parallel execution. Or at least mostly when we are doing parallel execution of the tests. So we moved through some nice articles to revise our understanding of Timeout again and again.


    But nevertheless the tests were failing with Timeout::Error on waiting for a redirect and in the html we could see the error returned by the server:

    <div><p>Invalid password</p></div>

    How come the password is Invalid

    No this took a while to debug and it seems rather mysterious but this is what we got:

    User password in DB is: 10124ddeaf1a69e3748e308508d916b6

    The server receives from the html form: 10124ddeaf1a69e3748e30850

    User password in DB is: 74c2a3e926420e1a30363423f121fc1e

    The server receives from the html from: 74c2a3e926420e1a3

    and so on and so on.

    Sometimes the difference is 8 symbols. Sometimes it is 2. Sometimes it is 16.

    It seems to be a client side issue

    Like. JavaScript. If there is an error this strange it has to be in the JavaScript. Right. There in the javascript we see:

    let form_data = form.serializeObject();
    this.perform('start_change', form_data);

    The form gets serialized. Probably it is not serialized correctly. Probably the values that we are sending are just not the values on the form. So I revised my knowledge on serializing objects in JavaScript with


    So far so good. But the serialization was not the problem. Here is what I did. I fixed all the passwords to be 32 symbols.

    let form_data = form.serializeObject();
     if(form_data["owner_password"].lenght != 32) {
            form_data["owner_password"] = "this was:" + form_data["owner_password"] + " which is less than 32 symbols"
    this.perform('start_change', form_data);

    It it happened. The value of the password field was simply not 32 symbols long. It was not filled during the test.

    A little bit of search and we arrive at:


    and there in the bottom of the issue, there is the standard: “Not our problem resolution” with the link to:


    It seems that google chromedriver is not filling all the characters in the password field. It is doing it on random and is completely unpredictable.

    Issue still exists on:

    Issue still exist for
    Chrome: Version 71.0.3578.98 (Official Build) (64-bit)
    Chromedriver chromedriver --version
    ChromeDriver 2.45.615279 (12b89733300bd268cff3b78fc76cb8f3a7cc44e5)
    Linux kireto-laptop3 4.4.0-141-generic #167-Ubuntu x86_64 GNU/Linux
    Description:	Ubuntu 16.04.5 LTS
    Release:	16.04
    Codename:	xenial

    Today we would try Firefox driver.

  • kmitov 6:00 pm on January 3, 2019 Permalink |
    Tags: http, Software Development   

    The curious case of …http status code 505 

    Today I learned that there was such a thing as http status code 505. Well, I’ve been in software development for quite some time, but 505 was a new for me.

    TL; DR;

    There was a space in the URL and this results in an error 505


    The curious case started with an error when we were checking that some links on the platform are valid. This is what happened:

    Message: "Error: The link https://www.fllcasts.com/competitions/ returned status code 505! 
    Error: The link https://www.fllcasts.com/competitions/ returned status code 505! 
    Error: The link http://www.fllcasts.com/courses/19-moving-straight-with-lego-mindstorms-ev3-robots returned status code 505! 
    Error: The link http://www.fllcasts.com/courses/6-box-robot-two-fewer-parts-and-one-motor-simplifying-a-robot returned status code 505! "

    Strange. A quick curl revealed that the url was correct and curl was returning a correct result. wget also showed that it is working.

    It took me about one hour. One hour looking and the problem was the following:

    When the HTML document was constructed it had the following content

    <a href="{{ competitions_link }} "><img..

    Note the space. The space after }} and before “. The space right here <a href=”{{ competitions_link }}HERE”>. This took an hour today. And the solution is just to strip.


    If interested here is the test for this problem:

    it "strips url and then checks them to avoid error 505" do
         link = ' https://www.fllcasts.com '
    stub_request(:head, "https://www.fllcasts.com").to_return(status: 201)
    success, message = UrlChecker::check_link link, true
    expect(success).to be_truthy
    expect(message).to be_empty

  • kmitov 10:02 am on December 31, 2018 Permalink |
    Tags: Software Development, , Trello   

    How to plan with Trello? Part 1 – backlog and sprint board 

    I recently shared this with a friend that is constantly getting lost with Trello and how exactly to structure his software project plan. I shared my experience with him and he kind of liked it so here is my story and the few rules that are keeping me sane for the past 2 years of following them.

    Main issues with planing a software project with Trello is to decide

    • are different features in different board,
    • why do you need lables. Are different features marked with labels
    • are different features in lists?
    • how do you set the priority for a task. Do you have a list for priority, or label for priority.

    Because of these questions for the last 4-5-6 years I’ve started and stopped using Trello many times.

    These are all difficult questions. Here are my simple solutions.


    Create two boards. Backlog and SprintXX. In the SprintXX you have three lists. XX is the number of the sprint. “SPXX Planned“, “Ongoing“, “Done SPXX December 01- December 15“. When the sprint that is two – three weeks finishes you archive “Done SPXX December 01- December 15” and create a new “Done SPXX+1 December 16-December 31” list. Then you rename list “SPXX Planned” to “SPXX+1 Planned” where XX is the number of the Sprint.

    This keeps the Trello clear.

    Create two boards

    Board one is the Sprint board
    Board two is the Backlog board

    If you are currently not working on the task and there is little to no chance to work on it in the next 3-4 weeks that it is in the Backlog. This means it will be handled later.

    Sprint Board

    The Sprint board has the name of the current Sprint. I like sprints that are 2-3 weeks long. It has three lists

    SPXX Planed

    The list has all the tasks that are planned for the current sprint or probably the next one. These are tasks that you are genuinely planning to do something about.


    These are all the tasks that we are currently working on. If we have even a single line of code for this task than we are working on it.

    Done SPXX December 01 – December 15

    These are all the tasks completed in the spring XX. Note that the list has the name “Done SPXX December 01 – December 15”. This is the full name of the sprint.

    At the end of the sprint

    When the spring ends you archive “Done SPXX December 01 -December 15”. You do not archive the tasks. You archive the whole list. This gives you a chance to get back to the list at the regular reviews that you are having with the team and actually review what has happened in this sprint.

compose new post
next post/next comment
previous post/previous comment
show/hide comments
go to top
go to login
show/hide help
shift + esc