Updates from June, 2021

  • kmitov 6:41 am on June 5, 2021

    Yet another random failing spec 

    (Everyday Code – instead of keeping our knowledge in a README.md let’s share it with the internet)

    This article is about a randomly failing spec. I spent more than 5 hours trying to track it down, so I decided to share with our team what happened and what the stupid mistake was.

    Random failing

    Randomly failing specs pass most of the time and fail only sometimes. The circumstances in which they fail seem to be random.

    Context

    At FLLCasts.com we have categories. There was an error when people were visiting the categories. We receive each and every error by email, and some of the categories had stopped working because of a wrong SQL query. After the migration from Rails 6.0 to Rails 6.1 some of the queries started working differently, mostly because of eager loads, and we had to change them.

    The spec

    This is the code of the spec:

    scenario "show category content" do
      category = FactoryBot.create(:category, slug: SecureRandom.hex(16))
      episode = FactoryBot.create(:episode, :published_with_thumbnail, title: SecureRandom.hex(16))
      material = FactoryBot.create(:material, :published_with_thumbnail, title: SecureRandom.hex(16))
      program = FactoryBot.create(:program, :published_with_thumbnail, title: SecureRandom.hex(16))
      course = FactoryBot.create(:course, :published_with_thumbnail, title: SecureRandom.hex(16))

      category.category_content_refs << FactoryBot.create(:category_content_ref, content: episode, category: category)
      category.category_content_refs << FactoryBot.create(:category_content_ref, content: material, category: category)
      category.category_content_refs << FactoryBot.create(:category_content_ref, content: program, category: category)
      category.category_content_refs << FactoryBot.create(:category_content_ref, content: course, category: category)

      expect(category.category_content_refs.count).to eq 4
      visit "/categories/#{category.to_param}"

      find_by_xpath_with_page_dump "//a[@href='/tutorials/#{episode.to_param}']"
      find_by_xpath_with_page_dump "//a[@href='/materials/#{material.to_param}']"
      find_by_xpath_with_page_dump "//a[@href='/programs/#{program.to_param}']"
      find_by_xpath_with_page_dump "//a[@href='/courses/#{course.to_param}']"
    end

    We add a few objects to the category and then we check that we see them when we visit the category.

    The problem

    Sometimes while running the spec only 1 of the objects in the category is shown. Sometimes none. Most of the time all of them are shown.

    The debug process

    The controller

    def show
      @category_content_refs ||= @category.category_content_refs.published
    end

    In the controller we just call published to get all the published content that is in this category. There are other things in show, but they are not relevant: we were using apply_scopes and some other concerns.

    The model

      scope :published, lambda {
        include_contents.where(PUBLISHED_OR_COMING_WHERE_SQL)
      }

    The scope in the model makes a query for published or coming.

    And the query, I kid you not, was committed in 2018 and has been with us ever since. Here it is:

    class CategoryContentRef < ApplicationRecord

        PUBLISHED_OR_COMING_WHERE_SQL = [' (category_content_refs.content_type = \'Episode\' AND (episodes.published_at <= ? OR episodes.is_visible = true) ) OR
         (category_content_refs.content_type = \'Course\' AND courses.published_at <= ?) OR
         (category_content_refs.content_type = \'Material\' AND (materials.published_at <= ? OR materials.is_visible = true) ) OR
         category_content_refs.content_type=\'Playlist\'', *[Time.now.utc.strftime("%Y-%m-%d %H:%M:%S")]*4].freeze

    end
    

    I will give you a hint that the problem is with this query.

    You can take a moment and try to see where the problem is.

    The query problem

    The problem is with the .freeze and the constant in the class. The query is initialized when the class is loaded. Because of this it takes the time at the moment of loading the class and not the time of the query.

    Because the specs are fast, sometimes the class is loaded right before the spec runs and sometimes other specs are executed in between.
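
    To make the difference between load time and call time concrete, here is a minimal sketch in plain Ruby, without Rails. Record, PublishedFilter, CUTOFF_AT_LOAD and cutoff_now are hypothetical names made up for this illustration, not code from our platform. The sketch assumes the record gets its published_at after the class has been loaded, as it would when a factory creates it inside a spec.

```ruby
# Hypothetical stand-ins for illustration only; not the real model.
Record = Struct.new(:title, :published_at)

class PublishedFilter
  # Evaluated ONCE, when the class body is loaded (the bug pattern).
  CUTOFF_AT_LOAD = Time.now.utc

  # Evaluated on EVERY call (the fix pattern).
  def self.cutoff_now
    Time.now.utc
  end

  # In-memory equivalent of "published_at <= ?".
  def self.published(records, cutoff)
    records.select { |record| record.published_at <= cutoff }
  end
end

# Time passes between loading the class and running the "spec".
sleep 0.01

# A record published after the class was loaded.
episode = Record.new("episode", Time.now.utc)

stale = PublishedFilter.published([episode], PublishedFilter::CUTOFF_AT_LOAD)
fresh = PublishedFilter.published([episode], PublishedFilter.cutoff_now)

puts "cutoff taken at class load finds #{stale.size} record(s)"
puts "cutoff taken at call time finds #{fresh.size} record(s)"
```

    In the model the same idea would probably mean keeping only the SQL text in the constant and building the time values inside the scope lambda, because the body of a scope lambda is evaluated each time the scope is called.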

    It seems simple once you see it, but these are the kind of things that you keep missing while debugging. They are right in front of your eyes and yet sometimes you just can’t see them, until you finally see them and then you cannot unsee them.

     
  • kmitov 5:38 am on October 19, 2020
    Tags: management

    Don’t fix the issue in the software. Improve the process. 

    Yesterday one of the features on our platform did not work. I was in a meeting, demonstrating it over a shared screen and talking with a potential client. I went to the page showing the IS Editor in our buildin3d.com platform and the editor for editing the assembly instructions did not start. A little rush of embarrassment and a few milliseconds later I knew what I had to do. Thanks to my seniority and extended experience in the world of web development I moved my fingers lightning fast on the keyboard and I refreshed the page. The editor started. The demonstration continued.

    I remembered that I had stumbled upon this issue a few days earlier, when I saw that the IS Editor was not loading when you first visit the page. The meeting continued, and I said something like “Sometimes when we are sharing the screen my bandwidth is small so we have to wait”. I suppose the client did not exactly understand what had just happened, but what I know is that the next time they try it on their side it will not work and they will be disappointed.

    Right after the meeting I was facing a problem. Should I open the repo now and start debugging, or should I wait a day or two for our team to look at this?

    One of the most difficult things about running a software company as a good software developer is having the patience to wait for the team of developers to resolve an issue.

    I was close to mad. How difficult could it be? After you commit something just go to the platform and see that it works. We have a lot of automation, a lot of testing and specs that have helped us a lot. We have a clean and I would say quite fast process for releasing a new version of any module to the platform. It takes anywhere from 2 minutes to about 20 minutes depending on what you are releasing. So after you release something just go and see and test and try it and make sure it works. How difficult could it be?

    I was mad. Like naturally and really mad. Not because this demonstration was almost ruined by this issue. I was mad because we’ve spent about 3-4 months working on this editor and it currently does not start. It is not true that the editor itself is not working. It is just not starting. Once it starts it works flawlessly, but a misconfiguration in the way it is started prevents it from even starting.

    It’s like getting into your Ferrari and it does not start because of a low battery in your key or something. There is nothing wrong with the Ferrari itself, but your key is not working.

    In this state of anger I opened up the repo. I tracked down the moment it was introduced. And here is the dilemma:

    1. Should I now start debugging it, and resolving it?
    2. Should I just revert the last 11 days of commits and return the platform to a previous state, completely removing the great improvements we’ve introduced in these last 11 days?
    3. Should I leave it for the next few days for the team to look at?

    The worst part is that I can fix the issue myself. But that is not my job. My team counts on me to spend more of my time with potential & existing clients, talking and discussing with them. Looking for ways they could integrate us. But at the same time I had an issue where a major feature was not working and would not work for the next few days, and in one sleepless night I could resolve it.

    I don’t have this problem with the other departments. When there is an issue with some of the 3D product animations and models or there is an issue with some of the engineering designs I do not feel the urge to go and resolve this issue. I have the patience to rely on the team for this. Basically because I lack the knowledge and the tools to resolve such issues.

    Years ago when we were starting with 3D animations and models I had great interest, but I openly refused to install any software for 3D animations and models on my machine. I knew myself and I knew my team. In school and at university I was trying out some 3D models and animations and it felt great. I learned a lot and I had a great time working on such projects. So I knew that if I installed some of the software on my machine there would be issues that would come to me, but that was not my role in my organization.

    Same for engineering. I have the complete patience to wait for days for an engineering design task to complete. I never start SOLIDWORKS myself and go on and “fix the things”. I could. I just don’t want to, as it will distract me from other important things and I know I can count on the engineers to do it.

    But with software it is always a little difficult. Not that I cannot delegate. I can. There are large parts of the code we are running that I have never touched or changed. So I thought – why was this particular issue different? What was my problem? Why was it bothering me? Why was this different from any other issue in software development that is reported, debugged and resolved? Where did the anger come from?

    I was angry because the process I’ve set up has allowed for this issue to occur.

    The IS Editor was working a few days ago. Now it was not working. This was not an issue of my software development skills, this was a challenge for my “organizing a software development process that produces a working software and deploys it to production a few times a day in a team with a large code base and a new R&D challenge that we were working on”.

    This I have found in my experience to be the most difficult problem for good software developers, one that mediocre and bad software developers do not face. When you know how to fix it and how to implement it, and you take on the task, then your time and energy are spent on resolving the issue. It might be better for the team as a whole if you spend your energy and resources on different tasks, like how to avoid a regression in a multi-team, multi-framework environment.

    Know what is important and where your efforts would be most valuable. I’ve stepped up and done a lot of software development in the team. I’ve single-handedly implemented a number of frameworks. Not just the architecture, but the actual implementation. I once deleted two human years of development and re-implemented the whole module almost from scratch. There is even a saying in the team “Kiril will roll up his sleeves and will implement this”.

    But no.

    There will always be issues in software development and we should think about whether our task is to resolve these issues, or to make sure these issues never occur in the first place. The latter is objectively the more important and difficult task.

     
  • kmitov 9:09 am on March 11, 2020
    Tags: awesome

    They will leave you, if they tell you everything is fine – [Everyday Company Culture] 

    Culture is how we do stuff around here. Read more about the [Everyday Company Culture] series.

    Today’s resolution. TL;DR

    If team members tell you everything is fine, it’s like a sure sign they will leave you.

    Today’s case

    People leave your organization for different reasons. One pattern I’ve noticed, learned and observed through the years is that people leave when “everything is fine” and people stay when there is a “list of things that are wrong”. People stay for “something” to fight for, because “something” is not right. People don’t stay because “everything is awesome”.

    As a manager be careful when team members give only positive feedback. Of course you want positive feedback and you want people to be happy. But this does not mean there is nothing left to work towards. Get feedback constantly, list things that are still not OK and work towards improving and resolving them.

    Root cause

    “Everything is fine” could mean that people don’t care enough to point out the problems.

    Full story

    A team member of our organization left us yesterday. He was a great guy with great potential. As a software product company we are working on developing frameworks and solutions and this requires a great set of skills. It is certainly not a regular software development job. We are the ones building the platforms and the frameworks and providing them to clients and other developers. My experience is that not many people can grasp the concepts, importance and rules when building things like a sustainable API. This guy did, but he left us yesterday as he wanted to explore a different career path. (yek, this is more corporate sounding than I am comfortable with, but you get the point)

    Anyway, this finally prompted me to write this article and dig deeper into the topic. I hope members in and out of our organization could benefit by identifying and approaching such cases. Looking even at my personal experience, as I have left two companies in my career, I did leave when everything was kind of ok.

    Feedback and retrospective is a must

    We do regular retrospectives and I try to get more formalized feedback from team members. This happens once every 1-2 months with a simple Trello card where everybody should answer a few simple questions like:

    1. What are we doing wrong
    2. What are we doing right
    3. What could we improve

    Team members answer these questions publicly or privately to their managers.

    We are also doing regular 1-on-1 discussions (because we are not doing meetings…) and talk about how our organization is doing. The means of receiving feedback could be different, but it is important to give and receive feedback regularly. Even on a daily basis. Of course, you don’t have to solve every issue over lunch, but you could take a note (literally take a note and write it down) if in a conversation someone is mentioning that they have a “problem with something”. After all your job is to address this.

    In these regular retrospectives I try to be the first to give feedback and be honest about where we are, what we have done and what we could improve.

    67 things we could improve – co-founder

    In one of these discussions I sat down with my co-founder and we started sharing all the things that we don’t like. We wrote them down. Every one of them. One of them was of course “we should make more money”. But the other 66 were basically not connected with money at all and were rather small things in the whole organization. So we were happy that we have the basics right and that in the large picture we are ok and moving in the right direction. But there were 66 things we came up with in about an hour that we think we should improve.

    Example – one of them was “more A/B testing” before making decisions. Since then we have done about 15 different tests that helped us learn more and we are doing more structured A/B testing. Yet still, if we had to list the things that we could improve right now, “more A/B testing” would make it to the list, as there are still some decisions that we make based on “feeling, preference and common sense”, which could be correct, but are not based on data.

    Point is – there are still things we could improve and we want to improve and we want to stay for.

    ‘I am a problem’ – VP of Engineering

    I’ve constantly received feedback from our VP of Engineering. He shares it publicly with the whole team and it is somewhere along the lines of “I think that the organization is great, but it seems that I am not doing enough and I am trying to change this,…”. Here again – other things are great, of course we could always get more clients, but it is “me that is not doing enough and I am working on fixing this”. There are few things better than hearing your VP of Engineering, who has more engineering experience than anyone else in the organization, come to the office and say “Yesterday I learned how we could do this thing 2 times faster”.

    ‘Our code needs to improve’

    As a software company we are working on our code. In one of our latest products we have some of the cleanest code that I’ve ever seen. We could deliver in minutes. Modularity is great. Concerns are separated. Almost no redundancy and coupling. Automatic tests for different modules run in seconds. It is just really fun.

    Yet I’ve constantly received feedback that “our code needs to improve”. Because there is always room for improvement. These tests that run for 15 seconds could run for 5. This class could be renamed to make it clearer. This method could be called with different params and refactored. And the code could be made even more beautiful. We have a backlog of about 81 things that are exactly this. Not features, but just things to improve and make even better. So we prioritize and work on them in order. As we work we learn more and the backlog grows even further, and that’s ok.

    Point is, there are things that we are working towards in the code. The code is never fine.

    “Everything is OK” and why at least ONE thing is not OK.

    Compare the above three examples with an “Everything is fine” feedback. No, it is not fine. It can never be fine. You could be doing too many meetings. Probably you are. Or sending too many emails. Probably you are. You could have unclear responsibilities. You could have people in positions that are not appropriate for them.

    Again, the point is that even if everything else in your organization is fine internally, even if you are the greatest “leader of all times”, even when people are earning enough, there is one thing that is still not fine. This thing is the reason your organization exists and the reason you’ve gathered your team. In our case – “people get confused and feel a lot of frustration when they don’t have easy-to-follow 3D assembly instructions available instantly on their device”. This is the reason we are developing our products. After all, you’ve gathered together as a team to tackle a problem and even if everything else is “OK”, it is still not OK that there are people feeling pain and in need of your medicine.

     
  • kmitov 6:34 am on February 21, 2020

    Everyday Company Culture – what’s it about 

    Culture is how we do stuff around here. Every day there is a new challenge with communication, transparency, accountability, with what we value and what we think of the world (mainly inside of our organization).

    With the Everyday Company Culture series I am trying to look at specific cases, the root cause of the challenge and how we approach them. This approach is different from setting X high-level, single-word values that are understood differently by each member of the organization. Example – transparency. It is easy to aim for ‘transparency’ in an organization. It is difficult to properly communicate what ‘transparency’ is. Same for ‘diversity’, ‘accountability’, ‘clarity in communication’, etc.

    My hope is that this series could bring more value inside of our organization and hopefully help other small and large teams work better.

    Check out the full series at http://kmitov.com/posts/category/everyday-company-culture/

     
  • kmitov 7:17 am on February 19, 2020
    Tags: communication, team

    Why you should make few assumptions about what team members have understood – [Everyday Company Culture] 

    Culture is how we do stuff around here. Every day there is a new challenge with communication, transparency, accountability, with what we value and what we think of the world (mainly inside of our organization).

    With the Everyday Company Culture series I am trying to look at specific cases, the root cause of the challenge and how we approach them. This approach is different from setting X high-level, single-word values that are understood differently by each member of the organization. Example – transparency. It is easy to aim for ‘transparency’ in an organization. It is difficult to properly communicate what ‘transparency’ is. Same for ‘diversity’, ‘accountability’, ‘clarity in communication’, etc.

    My hope is that this series could bring more value inside of our organization and hopefully help other small and large teams work better.

    Today’s resolution. TL;DR

    Don’t assume that people know. Ask for confirmation, especially when things seem trivial to you. If things are not trivial you will have spent time to properly communicate them. But when things are trivial from your point of view you can easily make the large assumption that they are also trivial for others.

    Today’s case

    “A Robotics field that costs about 150 euros and 2 days to assemble is lost and it would take about $500 and 2 days now to restore it”

    Root cause

    Assumption. Communication.

    We made a wrong assumption that a new team member understands a term that we’ve been using in our team for the last 7 years. We as a team made the assumption that the new member is familiar with how things are done in our team. We’ve never been explicit, or even thought about being explicit, about this specific topic.

    Full Story

    Ivo has been part of the team for about 1 year. He was using a robotics field for a project he had been working on and had already completed. The field has about 15 models placed on it. Ivo was asked a few times by the rest of the team when he was going to disassemble the field, because it takes a lot of space, he is currently not using it and won’t be using it in the near future.

    The end result is that he disassembles the field and he also disassembles the models on the field. But he should have just disassembled the field. Not the models.

    The challenge was that over the last 7 years the term “disassemble the field” has grown to mean something different in the team from what a relatively new team member would understand. We as a team made an assumption that Ivo would distinguish between “disassemble the field” and “disassemble the models”. For Ivo, on the other hand, the term “disassemble the field” means “disassemble both the field and the models”.

    Nobody on the team could imagine that we should be explicit about this and ask Ivo to differentiate between them. As a result we’ve lost about 4 days of work and about $700. It would be cheaper to buy new models than to try to assemble again the disassembled models.

     