Wrong / duplicate build number

Hi, we have a strange bug in the build sequence, where it goes #1078, #1080, and then #1080 again.

Not a big deal, but there is definitely a bug in the build numbers here. Clicking on a build to see its details does not fix the problem: both builds claim to be #1080.

https://www.bitrise.io/build/81be8e5c1a5f8bf6 should have been #1079, and https://www.bitrise.io/build/fafe18bbd8e677da the only #1080.

Oops, that’s indeed weird! @erosdome could you take a look at this please?

@daniel - @erosdome is on sick leave right now, but I’ll schedule this for the web team :wink:

I have webhooks set up from GitHub for both pushes and PRs. Just now, when I pushed a change to GitHub, two unusual things happened. First, for the first time ever, the PR build started before the push build. Second, both builds have the same build number.

I’ll add a screenshot if I can figure out how to do it.

Thanks!

We’re aware of this issue and will fix it ASAP. Unfortunately it is quite a complex one, so it might take some time, but it should be fixed this week.

A bit of technical info: right now the build number is calculated from the count of builds in the database for the given repository, then stored in the build’s DB entry when the build starts. The issue is that we now have quite a few cases where builds are started at the same time and one of them is deleted right away. E.g. when you have a webhook registered for both PR and code push, GitHub (or whichever git service sends the hooks) can send the two hooks at virtually the same time.

Bitrise.io will register the trigger intent and start to process it, and at some point it’ll create a DB entry for the build. The build trigger handling is one of the most complex parts of the backend, so processing a trigger can take quite a bit of time, and the system is highly distributed: the two webhook calls will run on completely separate servers, so the processing can happen at the exact same time. This can lead to the following situation: Trigger1 saves its build into the DB, and just a bit later Trigger2 also saves its build into the DB, but then Trigger1 gets to a check which determines that the build should not be started (e.g. based on the Trigger Map) and deletes the related build. Trigger2’s build now has a build number that was calculated while Trigger1’s build entry still existed, but as that entry was removed, the next trigger will get the same number.

Using a single DB entry for storing and allocating the build number would suffer from a similar issue, just the other way around: certain build numbers would simply never be used (because the number was allocated but then the related build DB entry was removed).
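To make the Trigger1/Trigger2 sequence easier to follow, here is a minimal sketch of it in Python. It is only an in-memory model, not the actual backend code: `BuildStore`, `create_build` and `delete_build` are made-up names, and "build number = row count + 1" is an assumption about how the count is turned into a number.

```python
# Hypothetical in-memory model of count-based build numbering.
# None of these names come from the real Bitrise backend.

class BuildStore:
    def __init__(self, existing_builds):
        # Each list entry stands for one build row and holds its stored number.
        self.rows = list(range(1, existing_builds + 1))

    def create_build(self):
        # Assumption: the number is the current row count plus one.
        number = len(self.rows) + 1
        self.rows.append(number)
        return number

    def delete_build(self, number):
        # Remove the row of a build that should not have been started.
        self.rows.remove(number)


store = BuildStore(existing_builds=1078)
t1 = store.create_build()  # Trigger1 saves its build
t2 = store.create_build()  # Trigger2 saves its build just after
store.delete_build(t1)     # Trigger1 is rejected (e.g. by the Trigger Map)
t3 = store.create_build()  # the next trigger counts the rows again

print(t1, t2, t3)  # prints: 1079 1080 1080
```

With 1078 existing builds this gives the same pattern as in the report: #1079 disappears and #1080 shows up twice.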

The only sane way to solve this is to calculate the build number and save it into the DB as late as possible in the algorithm. That requires quite a bit of careful revision and time, but we’ll definitely work on it this week, and it should be fixed by the end of the week (unless there’s an unexpected edge case).
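As a rough illustration of the "as late as possible" idea (a sketch only, not the actual fix): run every check that can reject a build first, and only compute and store the number for builds that will really start. `LateNumberingStore`, `handle_trigger` and the `should_start` flag are hypothetical names, and the model is single-threaded; in a distributed system the count and the insert would also have to happen atomically, e.g. inside one transaction.

```python
# Hypothetical sketch: the number is assigned only after all checks passed.
# This is an in-memory, single-threaded model, not real backend code.

class LateNumberingStore:
    def __init__(self, existing_builds):
        # Each list entry stands for one build row and holds its stored number.
        self.rows = list(range(1, existing_builds + 1))

    def handle_trigger(self, should_start):
        # Checks that can reject the build (e.g. the Trigger Map) run first.
        if not should_start:
            return None  # nothing was stored, so nothing has to be deleted
        # Only now is the number computed and saved.
        number = len(self.rows) + 1
        self.rows.append(number)
        return number


store = LateNumberingStore(existing_builds=1078)
results = [
    store.handle_trigger(should_start=False),  # rejected trigger
    store.handle_trigger(should_start=True),
    store.handle_trigger(should_start=True),
]
print(results)  # prints: [None, 1079, 1080]
```

Because a rejected trigger never gets a row, there is no delete that could shift the count under a build that was already numbered.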

Thanks for reporting!

We just deployed a fix which should solve this issue.

@guperrot & @steven-at-truemotion please let us know if you still see this happen!

I’ll activate a one-week auto-close on this issue; if we don’t get any reports, we’ll consider it fixed. Once this issue is closed, feel free to create a new one if you see this happen again, or if you have any other issue!

Thanks for reporting, and Happy Building! :slight_smile:

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.