Tagged: Software Development

  • kmitov 9:00 am on October 5, 2021 Permalink |
    Tags: Software Development   

    A day in the life of a CTO – “what do you do, day to day?”

    My brother asked me the other day:

    He: – So what do you do, day to day?

    Me: – I work in engineering (software, data and AI).

    It’s a little bit more than that. Not all days are the same. There are a lot of decisions to be made, and with a little luck those decisions will at least keep the ship headed in the right direction.

    I decided to look deeper and to record it. There is a difference between what we think we are doing and what we are actually doing. I tried to summarize just one of my recent days spent in engineering. This was a day without any software development for me.

    My hope with this article is to answer my brother’s question – “What do you do, day to day?” – and that the answer and examples will be interesting to people entering the world of software engineering and to business and product people trying to learn more about how their engineers spend their day.

    Adding a JSONB column to a schema

    A colleague was facing the issue of storing an array of values in a DB. The values were the result of calling the API of an external service for our business.

    How do you store these values? There are many different ways. I supported his recommendation to store the data in JSON format. I only suggested changing the type of the column to JSONB, as this would later allow us to query the table more easily. At the same time I had to re-think part of the stack to see if there would be any implications for the whole platform when this new JSONB column is introduced. Luckily there were none.
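
    Assuming a Rails stack like the one we use elsewhere on this blog, a minimal sketch of such a migration could look like the following. The table and column names are hypothetical.

    # Hypothetical migration. The table and column names are made up for illustration.
    class AddExternalServiceResultsToMaterials < ActiveRecord::Migration[6.1]
      def change
        # jsonb (instead of json) lets PostgreSQL index the values and query them later,
        # e.g. with the @> containment operator.
        add_column :materials, :external_service_results, :jsonb, default: [], null: false
      end
    end

    # A later query could then look like:
    # Material.where("external_service_results @> ?", [{ "status" => "ok" }].to_json)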

    Automated test that we create a table

    A colleague was working on a pipeline and the specs for this pipeline. The question was how to build a spec for part of the logic. The logic creates a new table. How do we test this in an automated way? How do we test that this logic creates the table? We considered two different approaches and together we looked for a good API call to test this.

    We decided on how to spec this behavior.
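
    A minimal sketch of how such a spec could look, assuming RSpec and ActiveRecord. The class and table names are hypothetical; the point is the API call that verifies the result.

    # Hypothetical spec. CreateEventsTable and the "events" table are made up for illustration.
    RSpec.describe CreateEventsTable do
      it "creates the table" do
        described_class.new.call

        # table_exists? is one ActiveRecord API call that can verify the new table is there
        expect(ActiveRecord::Base.connection.table_exists?("events")).to be true
      end
    end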

    DateTime field

    A colleague was facing an issue with a date field in the data platform that had invalid values. We were storing both the date and the time for an event where we should have been storing only the time. The implications were huge. We would now have to migrate the values in all the records.

    In this case we looked at the product requirements for what should be included in the data platform in the near future. It turns out that there is a requirement for engineering to store not only the time, but also the date of the event. This means there was nothing to fix, as engineering was ahead of the requirements. We only had to migrate a couple of records.

    The decision here was whether we should spend a day migrating this data and what would be the issue if the data was not migrated.

    Choose a service $50/80GB

    A colleague had the task of looking at different services that we could use. We had to decide: should we use this $50 service or that $50 service? The decision is important because once you decide to add a service to your stack it is difficult to move away from it. You kind of stay with it, at least for the near future.

    Sometimes when you compare two services on your own you can overlook a few aspects, so it is good practice to have someone else take a second look. In the end it is a team decision what to include in the stack.

    Integration with an external API

    A colleague was working on integrating with an external API. The issue was that this API returns different formats for different calls. The question was how to handle this. Should we hard-code the schema for this API, should we infer it, should we do something smarter? How does this impact the abstraction for the other data sources? We had to get on a call with the external API’s representatives to discuss whether they could help us.

    Creating new repos

    A colleague was working on new features in the platform. These new features should be extracted into new repositories. We had to decide on the names of the repositories. In the world of software development there are two hard things – cache invalidation and naming things. Naming is important because it gives you power over things. Once you name them you have power over them. If you name them badly, they have power over you. Nevertheless, we had to decide how to name two new code repositories.

    Abilities

    A colleague was working on the authorization part of the platform. We are adding new authorizations based on roles. He developed the code and was ready for a code review, so I decided to jump on it. The issue with the implementation was that it coupled the authorization for all the modules into a single class. Coupling is bad in the long run as it makes the code rigid and difficult to maintain. We spent time decoupling the implementation.
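
    As a sketch of the direction (the post does not prescribe a library, so this assumes CanCanCan-style abilities and hypothetical module names): each module contributes its own ability class, and the top-level class only composes them instead of knowing every rule.

    # Sketch only. The module and model names are hypothetical.
    class Ability
      include CanCan::Ability

      def initialize(user)
        # The top-level ability composes per-module abilities,
        # so it does not need to know the rules of every module.
        merge Materials::Ability.new(user)
        merge Groups::Ability.new(user)
      end
    end

    module Materials
      class Ability
        include CanCan::Ability

        def initialize(user)
          # Only the rules that concern materials live here.
          can :read, Material
          can :manage, Material, author_id: user.id
        end
      end
    end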

    System vs model specs

    A colleague was in the middle of developing an automated spec. There are generally two types of specs – integration and unit. In our case we use “system” and “model” specs. System specs test the behavior of a whole feature. Model specs test the behavior of a specific unit (class, function). My general rule of thumb is 10% system specs, 90% model specs, but start with the system spec. I’ve been in situations with too many system specs, which make the suite unmaintainable and require us to invest a lot of time in refactoring. Since then I am cautious about what kinds of specs are developed, when, and why. We revised the current assumptions and decided whether the current specs should be developed as system or model specs.
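
    A sketch of the difference (the model and the page are hypothetical): a model spec exercises one class in isolation, while a system spec drives the whole feature.

    # Model (unit) spec: fast, tests a single class.
    RSpec.describe Material, type: :model do
      it "is not published without a title" do
        expect(Material.new(title: nil)).not_to be_published
      end
    end

    # System spec: slower, exercises the whole feature through the UI.
    RSpec.describe "Publishing a material", type: :system do
      it "shows the material on the index page" do
        Material.create!(title: "Spaceball")
        visit materials_path
        expect(page).to have_content("Spaceball")
      end
    end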

    Flash messages

    A colleague was working on some flash messages on the platform that appear at a specific moment. I took a look and confirmed the implementation and the behavior.

    Constructing new objects

    A colleague was working on refactoring part of the code. A general rule of thumb I try to follow is to construct instances of a given type at only one specific place in the code. We revised the implementation and saw that there were a few places where instances of a given type were constructed. There is an easy solution for this. We scheduled it to be implemented the following week.
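
    A sketch of the rule of thumb, with hypothetical names. All other code asks the single factory for an instance instead of calling new directly.

    # Sketch only. Animation and AnimationFactory are hypothetical names.
    class AnimationFactory
      # The only place in the code base where Animation instances are constructed.
      def self.for(step)
        Animation.new(step)
      end
    end

    # Everywhere else in the code:
    animation = AnimationFactory.for(step)

    When the constructor changes, or a different subclass has to be returned, there is only one place to update.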

    Submit button change type to input

    A colleague was working on a feature on the web platform and noticed that a few of the forms had the wrong type of button. I was around and I was the one who had previously committed this form, so he notified me about the change and we discussed the implications.

    Structure of blob storage

    A colleague was working on an integration with an API that will store information in our big data lake. We had to sync on the structure of the lake and how it would accommodate the new API.

    Infrastructure from code

    A colleague was working on deploying on a cloud provider. We try to create our infrastructure from code. It is way too easy to set up an infrastructure, spend a week deploying it, and then be unable to reproduce it later on because of the gazillion different options and little configurations you have to do on the cloud providers. Just ask anyone who has configured AWS IAM permissions and resources.

    It is important to have a script that would create your infrastructure from code. We had to revise, review and think about the implications of the resources that the code creates.

    Conclusion

    No actual conclusion. This is just a diary of the day. I hope that my brother, along with many others, now understands more about our work.

     
  • kmitov 9:48 am on August 27, 2021 Permalink
    Tags: customer, Software Development   

    How we lost $1000 because we did not talk to the customer early enough 

    This content is password protected. To view it please enter your password below:

     
  • kmitov 4:19 pm on June 13, 2021 Permalink |
    Tags: Software Development

    Dependencies – one more variable adding to the “cost of the code” 

    One thing I have to explain a lot is the cost of software development. Why are things taking so long? Why is there any need for maintenance and support? Why are developers spending a significant amount of their time looking over the existing code base, and why can we not just add the next feature, and the next one after that?

    Today I have an example of this – “dependencies”.

    The goal of this article is to give people more understanding of how the “tech” works. I’ve seen that every line of code and every dependency that we add to a project will inevitably result in further costs down the road, so we should really stay free of unnecessary dependencies and features.

    Daily builds

    Many contemporary professional software projects have a daily build. This means that at least once every day the project is “built” from zero, all the tests are run and we automatically validate that customers can use it.

    Weekly dependencies updates

    Every software project depends on libraries that implement common functionality and features. Having few dependencies is healthy for the project, but having no dependencies and implementing everything on your own is just not viable in today’s world.

    These libraries and frameworks that we depend on also regularly release new versions.

    The general rule I follow in every project is that we check for new versions of the dependencies every Wednesday at around 08:00 in the morning. We check for new versions, download them, build the project and run the specs/tests. If the tests fail, it means that the new dependencies we’ve downloaded have somehow changed the behavior of the project.

    Dependencies change

    Most of the time dependencies are changed in a way that does not break any of the functionality of your project. This week was not such a week. A new dependency came along and it broke a few of the projects.

    The problem came from a change in two dependencies:

    Fetching websocket-driver 0.7.5 (was 0.7.4)
    Fetching mustache-js-rails 4.2.0.1 (was 4.1.0)
    Installing mustache-js-rails 4.2.0.1 (was 4.1.0)
    Installing websocket-driver 0.7.5 (was 0.7.4) with native extensions
    

    We have installed new versions of two of the dependencies – “websocket-driver” and “mustache-js-rails”.

    These two dependencies broke the builds.

    Why should we keep up to date

    Now, out of the blue, we have to resolve this problem. This takes time. Sometimes it is 5 minutes. Sometimes it could be an hour or two. If we don’t do it now, it will probably cost more time at a later stage. As the change in ‘mustache-js-rails’ is new, we have the chance to get in touch with the developers of the library and resolve the issue while it is fresh for them and they are still “in the context” of what they were doing.

    Given the large number of dependencies that each software project has there is a constant need to keep up to date with new recent versions of your dependencies.

    What if we don’t keep up to date?

    I have one such platform. We decided 6-7 years ago not to invest any further in it. It is still working, but it is completely out of date. Any new development will cost about the same as developing the platform from scratch. That’s the drawback of not keeping up to date. It happens even with larger systems at the state level – remember the famous search for COBOL developers, because a state did not invest in keeping its platform up to date for some 30+ years.

     
  • kmitov 7:48 am on February 3, 2021 Permalink |
    Tags: python, Software Development

    Where is the redundancy? 

    (Everyday Code – instead of keeping our knowledge in a README.md let’s share it with the internet)

    I think calling a method three times is redundant. But then again, you have to balance. Today’s article is about a code review that in the end took a few hours in total of different discussions, and I believe it is important. These kinds of things take time. Failed builds, difficult specs.

    The IRL example – Where is the redundancy? Is this DRY?

     @dataclass(frozen=True)
     class LdrawLine(abc.ABC):
    +    default_x: ClassVar[float] = 23
    +    default_y: ClassVar[float] = 45
    +    default_z: ClassVar[float] = 0
         """
             This is the abstract class which every LdrawLine should implement.
         """
    @@ -85,11 +88,19 @@ class LdrawLine(abc.ABC):
             elif line_args[1] == "STEP":
                 line_param = _Step()
             elif line_args[1] == "ROTSTEP":
    -            rostep_params = {
    -                "rot_x": float(line_args[2]),
    -                "rot_y": float(line_args[3]),
    -                "rot_z": float(line_args[4])
    -            }
    +            if line_args[2] == "END":
    +                rostep_params = {
    +                    "rot_x": LdrawLine.default_x,
    +                    "rot_y": LdrawLine.default_y,
    +                    "rot_z": LdrawLine.default_z,
    +                    "type": "ABS"
    +                }
    +            else:
    +                rostep_params = {
    +                    "rot_x": float(line_args[2]),
    +                    "rot_y": float(line_args[3]),
    +                    "rot_z": float(line_args[4])
    +                }
     
    

    This piece of code (along with a few other changes in the commit) was the root of a two-hour discussion in the team. A spec failed because some values were int while we were expecting them to be float.

    Calling ‘float’ three times like this is redundant

    The reason I think so is that if you need a change – say you would like to have a double value or an int value – you would have to change the code in three places.

    Probably a better solution would be:

    +            if line_args[2] == "END":
    +                rostep_params = {
    +                    "rot_x": LdrawLine.default_x,
    +                    "rot_y": LdrawLine.default_y,
    +                    "rot_z": LdrawLine.default_z,
    +                    "type": "ABS"
    +                }
    +            else:
    +                rostep_params = {
    +                    "rot_x": line_args[2],
    +                    "rot_y": line_args[3],
    +                    "rot_z": line_args[4]
    +                }
    +                # We are adding a loop so that float is called at a single place
    +                for key in ["rot_x", "rot_y", "rot_z"]:
    +                    rostep_params[key] = float(rostep_params[key])
     
    

    We call float at only a single place. Now we have to deal with the fact that the “rot_x” key appears in two places – yes, that is true, but this could easily be extracted, and we could iterate over the rostep_params values instead. But now we have consistency.

    Is it harder to read? Probably a little. Instead of simple statements you now have a loop. So you lose something, but you gain something: the float function is called at a single place.

    What are we doing with these rotsteps?

    ROTSTEP is a command in the LDR format. We support LDR for 3D building instructions. Here is one example with a FabBrix Monster that uses the LDR ROTSTEP as a command:

    FabBRIX Monsters, Cthulhu in 3D building instructions
     
  • kmitov 8:31 pm on January 28, 2021 Permalink |
    Tags: Software Development

    How does software become unmaintainable? – a practical example

    (Everyday Code – instead of keeping our knowledge in a README.md let’s share it with the internet)

    An ugly, often neglected truth of the industry is that the software we work on does not become unmaintainable, tedious and difficult to work with overnight. It becomes like this with every change we introduce. While each individual change might not be “that bad”, when they pile up we no longer have a well-decoupled, maintainable system. We end up with a mess, which we then re-write with a new framework, a new team, new concepts and architecture, just trying to do better. But most of the time the problem is in the single change that we make today – is it making the software system better or worse?

    This article is about a practical example of today and how we got to stop it.

    To better or worse

    When developing software incrementally we introduce changes. There are basically two options – either the new changes make the system better or they make it worse.

    Here is the change from today

    --- a/file1.rb
    +++ b/file1.rb
    @@ -8,6 +8,7 @@ module IsBackend
     
         def embed
    +      @full_video_src = @material.video_refs.where(usage_type: "full_video").first.try(:video).try(:source_url)
     
    diff --git a/file2.rb b/file2.rb
    index bce7cd1a0..43b3768ea 100644
    --- a/file2.rb
    +++ b/file2.rb
     
       before_action do
         @namespaces = [:author]
    +    @full_video_src = @material.video_refs.where(usage_type: "full_video").first.try(:video).try(:source_url)
       end
    
    diff --git a/file3.rb b/file3.rb
    index 650456cb0..8c772288a 100644
    --- a/file3.rb
    +++ b/file3.rb
    @@ -41,6 +41,8 @@ class MaterialsController < CommonController
         @client_import_path = "shared" if Rails.application.config.platform.id == "b3"
    +    @preview_video_src = @material.video_refs.where(usage_type: "preview").first.try(:video).try(:source_url)
    +    @full_video_src = @material.video_refs.where(usage_type: "full_video").first.try(:video).try(:source_url)
       end
     

    The logic is not important. We basically get the video for a material. What’s important is that the code is practically the same in all three places – four calls across three files, and they are all the same.

    In the past I was leading a class in Software Development. One thing I tried to teach each student was:

    When you copy and paste you introduce a bug. That’s a fact of the industry.

    The reason is that once you copy/paste you have to support the same logic in more than one place, and you will simply forget about the second place the next time you change the logic. It might not be you, it might be colleagues working years from now on the same code, but they will forget to change both places. That’s the problem with redundancy.

    Remove the redundancy, and you remove most of the bugs
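
    One possible way to remove this particular redundancy (a sketch, not the actual fix from the commit) is to move the lookup onto the model, so the three controllers share a single implementation:

    # Sketch only; assumes Material has_many :video_refs, as the diff suggests.
    class Material < ApplicationRecord
      has_many :video_refs

      def video_source_url(usage_type)
        video_refs.where(usage_type: usage_type).first.try(:video).try(:source_url)
      end
    end

    # In the controllers:
    @full_video_src    = @material.video_source_url("full_video")
    @preview_video_src = @material.video_source_url("preview")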

    How did we stop it?

    A simple review. Just ask a second person to check your commit. This simple review process also protects you from a large portion of the other bugs that are still left after you’ve cleared all the redundancies (of course).

    Enjoy

    The code in question is part of the logic delivering these instructions. See how the video is displayed. That’s what the feature was about. Enjoy.

    Torvi – Ball shooting Lego machine
     
  • kmitov 7:51 am on January 19, 2021 Permalink |
    Tags: Software Development

    Same code – two platforms. With Rails. 

    (Everyday Code – instead of keeping our knowledge in a README.md let’s share it with the internet)

    We are running two platforms: FLLCasts and BuildIn3D. The platforms address entirely different problems for different sets of clients, but with the same code. FLLCasts is about eLearning, learning management and content management, while BuildIn3D is about eCommerce.

    What we are doing is running both platforms with the same code and this article is about how we do it. The main purpose is to give an overview for newcomers to our team, but I hope the community could benefit from it as a whole and I could get feedback and learn what others are doing.

    What do we mean by ‘same code’?

    FLLCasts has a route https://www.fllcasts.com/materials. Results are returned by the MaterialsController.

    BuildIn3D has a route https://platform.buildin3d.com/instrutions. Results are returned by the same MaterialsController.

    FLLCasts has things like Organizations, Groups, Courses, Episodes, Tasks which are for managing the eLearning part of the platform.

    BuildIn3D has none of these, but it has WebsiteEmbeds that allow eCommerce stores to embed 3D building instructions and models.

    We run the same code with small differences.

    Do we use branches?

    No, we don’t. Branches don’t work for this case. We tried having an “fc_dev” branch and a “b3_dev” branch for the different platforms, but it gets difficult to maintain. You have to manually merge between the branches. It is true that Git has made merging quite easy, but it is still an “advanced” task, and it gets tedious when you have to do it a few times a day and resolve conflicts almost every time.

    We use rails engines (gems)

    We are separating the platform into smaller Rails engines.
    A common Rails engine between FLLCasts and BuildIn3D is called fc-author_materials. It provides the functionality for an author to create a material both on FLLCasts and on BuildIn3D.

    The engine providing the functionality for Groups for the eLearning part of FLLCasts is called fc-groups. This engine is simply not installed on BuildIn3D; we install it only on FLLCasts.

    What does the Gemfile look like?

    Like this:

    install_if -> { !ENV.fetch('CAPABILITIES','').split(",").include?('--no-groups') } do
      gem 'fc-groups_enroll', path: 'gems/fc-groups_enroll'
      gem 'fc-groups', path: 'gems/fc-groups'
    end
    

    We call them “Capabilities”. By default each platform is started with the “Capability” of having Groups, but we can disable it and tell the platform to start without Groups. When the platform starts, the Groups are simply not there.

    How about config/routes.rb?

    The fc-groups engine installs its own routes. This means that the main platform config/routes.rb is different from gems/fc-groups/config/routes.rb and the routes are installed only when the engine is installed.

    Another option is to have an if statement and to check for capabilities in the config/routes.rb. We still have to decide which is easier to maintain.
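
    A sketch of the second option, assuming the same CAPABILITIES convention as in the Gemfile (the engine class name is hypothetical):

    # config/routes.rb - sketch only
    Rails.application.routes.draw do
      unless ENV.fetch('CAPABILITIES', '').split(",").include?('--no-groups')
        mount FcGroups::Engine, at: "/groups"
      end
    end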

    Where do we keep the engines? Are they in a separate repo?

    We tried. We have a few of the engines in separate repos. With time we found out it is easier to keep them in the same repo.

    When the engines are in separate repos you have very strict dependencies between them. This proves to be useful but costs a lot in terms of development and of creating a clear API between the engines. This could pay off if we would like to share the engines with the rest of the community, as, for example, Refinery is doing. But we are not there yet. That’s why we found we could spend the time more productively developing features instead of discussing which class goes where.

    With all the Rails engines in a single repo we have the mighty monolith again, and we have to be grown-ups in the team and maintain it, but it is easier than having them in different repos.

    How do we configure the platforms?

    FLLCasts will send you emails from team [at] fllcasts [dot] com

    BuildIn3D will send you emails from team [at] buildin3d [dot] com

    Where is the configuration?

    The configuration is in the config/application.rb. The code looks exactly like this:

    platform = ENV.fetch('FC_PLATFORM', 'fc')
    if platform == 'fc'
      config.platform.sender = "team [at] fllcasts [dot] com"
    elsif platform == 'b3'
      config.platform.sender = "team [at] buildin3d [dot] com"
    end

    When we run the platform we set an ENV variable called FC_PLATFORM. If the platform is “fc” this means FLLCasts. If the platform is “b3” this means BuildIn3D.

    In config/environments/production.rb we refer to Rails.application.config.platform.sender. In this way we have one production env for both platforms. We don’t have many production envs.
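
    As a minimal sketch, assuming the sender is used for the mailer defaults (the post only says it is referenced in production.rb), that reference could look like:

    # config/environments/production.rb - sketch only
    Rails.application.configure do
      config.action_mailer.default_options = {
        from: Rails.application.config.platform.sender
      }
    end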

    Why not many production envs?

    We found out that if we have many production envs, we would also need many dev envs and many test envs, and there would be a lot of duplication between them.

    That’s why we are putting the configuration in the application.rb. It’s about the application, not the environment.

    How do we deploy on heroku?

    The first rule is: when you deploy one platform you also deploy the other. We do not allow different versions to be deployed. Both platforms always run the same code. Otherwise it gets difficult.

    When we deploy we do

    # In the same build we 
    # push the fllcasts app and then the buildin3d app
    git push fllcasts production3:master
    heroku run rake db:migrate --app fllcasts
    
    git push buildin3d production3:master
    heroku run rake db:migrate --app buildin3d

    In this way both platforms always share the same code, except for the short period of a few minutes between the deployments.

    How are the views separated?

    The platforms share a controller, but the views are different.

    The controller should return different views for the different platforms. Here is what the controller is doing.

    def show
      # If the platform is 'b3' return a different set of views
      if Rails.application.config.platform.id == "b3"
        render :template => Rails.application.config.platform.id + "/materials/show"
      else
        render :template => "/materials/show"
      end
    end

    At the same time we have the folders:

    # There are two separate folders for the views. One for B3 and one for FC 
    fc-author_materials/app/views/b3/materials/
    fc-author_materials/app/views/materials/

    How do we test?

    Testing proved to be challenging at first. As the code is the same, the specs should be the same most of the time, right?

    Well, no. The code is the same, but the views are different. This means that the system specs are different. We write system and model specs and we don’t write view and controller specs (if you are still writing view and controller specs you should consider stopping; they were deprecated years ago).

    As the views are different the system specs are different.

    We tag the specs that are specifically for the BuildIn3D platform with a tag platform:b3

    
      context "platform is b3", platform: :b3 do
        before :each do
          expect_b3
        end

    When we run the specs we first run all the specs that are not specific to b3 with:

    $ rake spec SPEC="$specs_to_build" SPEC_OPTS="--tag ~platform:b3 --order random"

    Then we run a second suite for the tests that are specifically for the BuildIn3D platform.

    # Note that here we set an ENV and we use platform:b3 and not ~platform:b3
    $ FC_PLATFORM="b3" rake spec SPEC="$specs_to_build" SPEC_OPTS="--tag platform:b3 --order random"

    I can see how this would become difficult to maintain if we start a third or a fourth platform, but we will figure it out when we get there. It is not something worth investing any resources into, as we do not plan to start a new platform soon.

    Conclusion

    That’s how we run two platforms with the same code. Is it working? We have about 200 deployments so far in this way. We see that it is working.

    Is it difficult to understand? It is much easier than different branches and it is also much easier than having different repos.

    To summarize – we have a monolith app separated into small engines that are all in the same repo. When running a specific platform we install only the engines that we need. Controllers are the same, views could be different.

    I hope this was helpful and you can see a way to start a spin-off of your current idea and to create a new business with the same code.

    There is a lot to be improved – it would be better to have each Rails engine as a completely separate project in a different repo that we just include in the platform. But we still don’t have the requirement for this, and it would require months of work on a 10-year-old platform like ours. Once we see a clear path for it to pay off, we would probably do it this way.

    For fun

    Thanks for stopping by and reading through this article. Have fun with this 3D model and building instructions.

    GeoShpere3 construction with GeoSmart set

     
  • kmitov 5:25 pm on January 18, 2021 Permalink |
    Tags: Software Development

    The benefits of running specs against nightly releases of dependencies. 

    Spend some time and resources to set up your Continuous Integration infrastructure to run your spec suites against nightly releases of your dependencies. The benefits are larger than the costs.

    Context

    To further explain the point I will use an example from today.

    We run our specs daily against the latest nightly release of BABYLON.js. On Friday one spec failed. I reported it in the forum (not even a GitHub issue). A few hours later there was a fix and a PR merged into the main branch of BABYLON.js. We would have the new nightly in a day or two.

    Our specs pass with version 4.2.0 of BABYLON.js, but they fail with BABYLON 5.0.0-alpha.6. A few of the hundreds of extensions running in the Instructions Steps (IS) Framework use BABYLON.js. The IS Framework powers the 3D building instructions at FLLCasts and BuildIn3D.

    BABYLON.js provides two releases of their library.

    1. Stable – available on https://cdn.babylonjs.com/babylon.js
    2. Preview – available on https://preview.babylonjs.com/babylon.js

    How do we run the specs against the preview (nightly) release of BABYLON.js?

    We’ve configured Jenkins to do two builds. One is against the official release of BABYLON.js that we are using on production. The second run is against the preview release.

    When there is a problem in our code both builds will fail. When there is an issue with the new version of BABYLON.js only the second build fails.

    What is the benefit?

    I think of the benefit as “being in the context”. The Babylon team is working on a new feature or changing something. If we find an issue with this change six months later, it would be much more difficult for them to switch context and resolve it. Probably there are already other changes on top. But when we as developers are “in the context”, when we are working on something, have made a change today and there is an issue with this change, it is much easier to see where the problem is. You are in the same context.

    The other large benefit is that when 5.0.0 is released we will know from day one that we support it and we can switch production to the new version. There are exactly 0 days for us to “migrate” to the new version.

    How much does it cost us?

    Basically – zero. The specs are run in under 60 seconds and the build is configured with a param.

    What if there are API changes?

    We can’t just run the same code if there are API changes in BABYLON.js. That’s why we have a branch for it. If there are API changes we can change our code in the babylon-5.0 branch and keep it up to date with the changes in dev, which most of the time is resolved with a simple merge.

    But BABYLON.js is a stable library. There are not many API changes that are happening. At least not in the API that we are using.

    For fun

    As you are here, here is one instruction

    Large Spaceball from Geosmart Spaceball set in 3D
     
  • kmitov 5:36 pm on November 17, 2020 Permalink |
    Tags: Software Development

    Technical (code) Debt and how we handle it. 

    The subject of technical debt is interesting to me. I recently got a connection on LinkedIn offering to help us identify, track and resolve technical debt, which compelled me to write this article and give some perspective on how we manage technical debt in our platforms and frameworks, stopping specifically on a few examples from the fllcasts.com and buildin3d.com platforms along with the Instructions Steps (IS) framework that we are developing. I hope it is useful for you all.

    What is technical debt?

    Here is the definition from the first source that came up. It is pretty straightforward –

    Technical debt is a concept in programming that reflects the extra development work that arises when code that is easy to implement in the short run is used instead of applying the best overall solution.

    (https://www.techopedia.com/definition/27913/technical-debt)

    What does it look like?

    The way I think about technical debt is this: we write certain structures, like for example an “if”, which in many cases is technical debt. Of course there are many other types, but I will stop at this one for this article.

    // In this example we do a simple if for a step  when visualizing 3d assembly instructions
    // if there is animation for the specific step created by the author and persisted in the file we play this animation, but if there is no animation in the file we create a default animation.
    if(step.hasAnimation()) {
     return new AnimationFromFile(step)
    } else { 
     return new DefaultAnimation(step)
    }

    The problem with the technical debt here is that later in the development of the project two new cases might arise.

    Case 1 – we want to provide the functionality for AnimationFromFile(step) only to paying customers, otherwise we return just the default animation. The code then becomes:

    // We check if the user is subscribed and only then provide the better experience
    if(step.hasAnimation() && customer.hasSubscription()) {
     return new AnimationFromFile(step)
    } else { 
     return new DefaultAnimation(step)
    }

    Why is this bad? Where is the debt? The debt is that we are now coupling the logic for playing animations with the logic for customers that have subscriptions. This means that when there are changes to the subscription logic and API we must also change the logic for handling animations.

    Case 2 – we introduce a third type of animation that is only for users with a certain WebGL feature in their browser. The code becomes:

    // We check if the browser supports the WebGL feature in question and then return a FancyAnimation.
    if(step.hasAnimation() && customer.hasSubscription()) {
      return new AnimationFromFile(step)
    } else {
      if (webGlFeaturePresent()) {
        return new FancyAnimation()
      } else {
        return new DefaultAnimation(step)
      }
    }

    Now we have logic that knows about creating a DefaultAnimation, about reading from files, about what a subscription is and when users are subscribed, and it also knows a lot about browsers and their support for WebGL.

    At a certain point in time we would have to refactor this logic and separate it into more decoupled pieces. That is technical debt.

    How was Technical Debt created (in the example above)?

    We took the easy path by placing one more if in the logic, but we knew that at some point we would have to refactor it.

    Should we avoid Technical Debt?

    I think a good architecture could prevent a lot of the technical debt from occurring. A good decoupled architecture with small units, clear boundaries and no state will result in zero technical debt, and we should strive to create such systems. In practice, the world is not perfect. In a team of engineers, even if you spend all your time fighting technical debt, it is enough for a single colleague on a single occasion to take the easy path and add one more if to “fix this in 5 minutes instead of 3 hours” and the technical debt is already there. You’ve borrowed from the future.

    How do we track technical debt?

    I have personally learned to live with some technical debt. If I now do

    $ git grep "FIXME" 

    in one of our platforms we get 37 results. These are 37 places where we think we could have made the implementation better, and we even have some idea how, but we’ve actively decided that now is not the time for it. Probably this part of the code is not used often enough. Or we are waiting for specific requirements from specific clients to come, and we would address it then – when there is someone to “pay for it”. Can we address it now? Of course. It would take us two or three days, but the question is: why? Why should we address this now? Would it bring us more customers, would it bring more value to the customers? It would surely make our life easier. So it is a balance.

    Our balance

    I can summarize our balance like this.

    1. We identify part of the code as technical debt (because we do regular code reviews).
    2. We try to look at “what went wrong” and understand how this could be implemented better. We might even try it in a different branch, but we do not spend that many resources on this.
    3. We then know “what went wrong” and we agree to be more careful and not to take on debt the next time, but instead to implement it in the right way the first time.
    4. After that we decide if it is worth refactoring the current issue – are new clients coming that would ask us for modifications in these parts of the code?

    That’s it.

    Simple “FIXME”, “TODO”, “NOTE”, “IMPORTANT”, “SECURITY” tags in the code, git grep to see where we stand, and a balance with trying to learn how to do it correctly next time.

    How can we solve Technical Debt for the example above?

    In buildin3d we have a framework with an event-driven plugin architecture. So for us it was simply a matter of registering a different plugin for the different features.

    // Pseudo code is 
    framework.register(new FancyAnimation())
    framework.register(new AnimationFromFile()) 
    framework.register(new DefaultAnimationExtension())
    
    framework.arriveOnStep((step) => {
      // ...ask all the extensions for an animation and play the first animation that is returned
    })

    The question is at which stage you invest in a framework.

    What about MVP(s)?

    The greater balance is sometimes between an MVP and a working product. On one occasion we had an open source tool that was doing exactly what we needed. It was converting one 3D file format to a different 3D file format. We started the project. We used the open source tool. We delivered a working MVP in about a month and we took on a lot of debt, because this tool came with other dependencies and was clearly not developed to be supported and extended. It was clear from the beginning that once new client requirements started coming we would have to re-write almost everything. And we waited. We waited for about 2 years. For 2 years we were extending the initial implementation, and one day a client came with a requirement that we could no longer support. Then it took us about 6 months to re-write the whole implementation in a completely new, much more extensible way that could easily accommodate new requirements.

    Conclusion

    Try not to take on technical debt.

    If you have to, then at least try to learn why it happened and how not to do it in the future. You will become exponentially better.

    Write a comment in the code about why you think this is debt and how it should be approached. Spend some time reviewing and resolving debt when it pays off.

     
  • kmitov 6:48 am on November 2, 2020 Permalink |
    Tags: Software Development

    A week ago I gave a nice lecture about Google Closure Compiler and how to use it in ADVANCED_OPTIMIZATION mode. It is available in Bulgarian at https://softuni.bg/trainings/3194/advance-javascript-compilation-with-google-closure-compiler-why-and-how

     
  • kmitov 5:38 am on October 19, 2020 Permalink |
    Tags: management, Software Development   

    Don’t fix the issue in the software. Improve the process. 

    Yesterday one of the features on our platform did not work. I was in a meeting, demonstrating it over a shared screen and talking with a potential client. I went to the page showing the IS Editor in our buildin3d.com platform and the editor for editing the assembly instructions did not start. A little rush of embarrassment and a few milliseconds later I knew what I had to do. Thanks to my seniority and extended experience in the world of web development I moved my fingers lightning fast on the keyboard and refreshed the page. The editor started. The demonstration continued.

    I remembered that I had stumbled upon this issue a few days earlier and had seen that the IS Editor was not loading when you first visit the page. The meeting continued. I said something like “Sometimes when we are sharing the screen my bandwidth is small, so we have to wait”. I suppose the client did not exactly understand what had just happened, but what I do know is that the next time they try it on their side it will not work and they will be disappointed.

    Right after the meeting I was facing a problem. Should I now open the repo and start debugging, or should I wait a day or two for our team to look at it?

    One of the most difficult things about running a software company as a good software developer is having the patience to wait for the team of developers to resolve an issue.

    I was close to mad. How difficult could it be? After you commit something, just go to the platform and see that it works. We have a lot of automation, a lot of tests and specs that have helped us a lot. We have a clean, and I would say quite fast, process for releasing a new version of any module to the platform. It takes anywhere from 2 minutes to about 20 minutes, depending on what you are releasing. So after you release something, just go and see and test and try it and make sure it works. How difficult could it be?

    I was mad. Genuinely, really mad. Not because this demonstration was almost ruined by the issue. I was mad that we’ve spent about 3-4 months working on this editor and it currently does not start. It is not that the editor itself is not working. It is just not starting. Once it starts it works flawlessly, but a misconfiguration in the way it is started prevents it from even starting.

    It’s like getting to your Ferrari and it does not start because of a low battery in your key or something. There is nothing wrong with the Ferrari itself, but your key is not working.

    In this state of anger I opened up the repo. I tracked down the moment it was introduced. And here is the dilemma:

    1. Should I start debugging and resolving it now?
    2. Should I just revert the last 11 days of commits and return the platform to a previous state, completely removing the great improvements we’ve introduced in these 11 days?
    3. Should I leave it for the team to look at in the next few days?

    The worst part is that I could fix the issue myself. But that is not my job. My team counts on me to spend more of my time with potential and existing clients, talking and discussing with them, looking for ways they could integrate us. But at the same time I had an issue where a major feature was not working and would not work for the next few days, and in one sleepless night I could resolve it.

    I don’t have this problem with the other departments. When there is an issue with some of the 3D product animations and models, or an issue with some of the engineering designs, I do not feel the urge to go and resolve it. I have the patience to rely on the team for this, basically because I lack the knowledge and the tools to resolve such issues.

    Years ago, when we were starting with 3D animations and models, I had great interest, but I openly refused to install any 3D animation and modeling software on my machine. I knew myself and I knew my team. In school and in university I tried some 3D models and animations and it felt great. I learned a lot and had a great time working on such projects. So I knew that if I installed some of this software on my machine, issues would start coming to me, but that was not my role in the organization.

    Same for engineering. I have complete patience to wait for days for an engineering design task to complete. I never start SOLIDWORKS myself and go on to “fix the things”. I could. I just don’t want to, as it would distract me from other important things, and I know I can count on the engineers to do it.

    But with software it is always a little difficult. Not that I cannot delegate. I can. There are large parts of the code we are running that I have never touched or changed at all. So I thought: why was this particular issue different? What was my problem? Why was it bothering me? Why was this different from any other issue in software development that is reported, debugged and resolved? Where did the anger come from?

    I was angry because the process I had set up had allowed this issue to occur.

    The IS Editor was working a few days ago. Now it was not. This was not an issue of my software development skills; this was a challenge to my ability to organize a software development process that produces working software and deploys it to production a few times a day, in a team with a large code base and a new R&D challenge that we were working on.

    In my experience this is the most difficult problem for good software developers, one that mediocre and bad software developers do not face. When you know how to fix it, how to implement it, and you take on the task, then your time and energy are spent on resolving the issue. It might be better for the team as a whole if you spend your energy and resources on different tasks – like how to avoid a regression in a multi-team, multi-framework environment.

    Know what is important and where your efforts would be most valuable. I’ve stepped up and done a lot of software development in the team. I’ve single-handedly implemented a number of frameworks. Not just the architecture, but the actual implementation. I once deleted two human-years of development and re-implemented the whole module almost from scratch. There is even a saying in the team: “Kiril will roll up his sleeves and implement this”.

    But no.

    There will always be issues in software development, and we should think about whether our task is to resolve these issues, or to make sure they never occur in the first place. The latter is objectively the more important and difficult task.

     