Why I joined the Kogan.com engineering team

Welcome to our monthly “Why I joined the Kogan.com engineering team” series! Every month, we talk with members of the Kogan.com engineering team to learn who they are and why they chose to pursue careers at Kogan.com.

Mengfei is our Designer extraordinaire who has been with us for over 4 years now. Her role sees her playing a big part in how our website interacts with our customers, and she is involved in turning features developed by our engineers into user-friendly applications for our customers. In her spare time, Mengfei loves to bake, travel and play the viola. Her superpower is spotting our pet cats lurking around our team’s backgrounds during video calls! Mengfei is also an avid photographer who regularly shares brilliant snaps of her travels with the wider Kogan.com team.

Mengfei at Lake Tyrrell, which is between Melbourne CBD & Mildura

Describe what it’s like being a part of the Kogan.com Engineering team.

Busy, collaborative and an environment where creativity is embraced.

Every day we receive many requests from other teams, as well as feedback from customers. We also have new features that need to be analysed, designed, implemented, and released. As a UX designer at Kogan.com, I am always busy with proposing concepts, receiving feedback, and refining designs. During this process, there is a lot of room to be creative, especially when collaborating with other team members.

We’re always brainstorming various ways to solve usability issues, enhance the user experience, and deliver the best e-commerce site for our customers!

What is something unique about the Kogan.com Engineering team?

Our team is highly collaborative and we love supporting each other.

I joined Kogan.com 4 years ago as my first job after migrating to Australia. We were a team of 14 members at that time. In my time here, I have learnt a lot not only about design, but also about stakeholder management, negotiating requirements with developers, and even about the Australian culture! 

During this time, I have gained so much knowledge about how customers interact with our site, which helps with designing features that give our customers a great shopping experience. I have also been fortunate enough to be part of the team building new vertical sites aligned with our overall company growth strategy.

4 years ago, when I joined Kogan, I received a lot of support from team members. I had very little knowledge about front-end development at that time. The developers took time to walk me through the process and explore the different tools, and we worked together to find an efficient process to turn designs into implementation. Goran, our CTO, was always kind with his time and supported me by reviewing my designs. He always empowered me and raised suggestions when I got stuck.

I am so proud to say that I have now become one of the key members in our team who supports others by providing these design suggestions. I also play a big part in the user acceptance testing process. This helps us to get familiar with products and features much faster.

Tell us about a work challenge you’ve had to recently resolve

Our team has doubled in size recently due to the growth of our business and we’re seeing more and more new faces join us! To ensure our new developers aren’t waiting for my designs, I’ve had to learn to manage my time more effectively so I can deliver on these designs for my team and stakeholders.

Tell us about your proudest moment at work

I am really proud to see the new features that I’ve designed get released and used by customers (though this is the whole team’s effort, not my own ^_^).

Debugging Celery Issues in Django

Lockdown has ended in Melbourne and we’re able to resume mingling, gossiping, and chattering. Just before we could get off our seats and out the door though, Celery (our distributed task queue) jumped the gun and had us all scratching our heads on a recent issue we uncovered at Kogan.

Most Django developers will use Celery at least to manage long-running tasks (as in longer than what’s reasonable in a web request/response cycle). It’s easy to set up and easy to forget that it’s running, until of course something goes wrong.

We observed that at around midnight, hundreds of gigabytes of data were recorded as ingress to our RabbitMQ node hosted on AWS, wreaking havoc on available memory and CPU utilisation.

We continued to investigate. CloudWatch metrics revealed that the bulk of the data was originating from our worker autoscaling group, narrowing the search down.

Introducing Celery mingling.

Celery keeps a list of all revoked tasks, so when a revoked task comes in from the message queue it can be quickly discarded. Since Celery can be distributed, there’s a feature enabled by default called mingling which enables Celery nodes to share their revoked lists when a new node comes online. On paper, this sounds like a good thing: If a task is revoked then it shouldn’t be executed! Our use case doesn’t involve revoking tasks so it seemed harmless - if by chance we wanted to revoke a task manually we’d be able to, knowing that it will propagate to all nodes.

Unfortunately there’s more to the revoked list than meets the eye. Here’s a snippet from Celery:

def maybe_expire(self):
    """If expired, mark the task as revoked."""
    if self._expires:
        now = datetime.now(self._expires.tzinfo)
        if now > self._expires:
            revoked_tasks.add(self.id)
            return True

Adding an expiration to tasks is something we do a lot, as many of our actions are time-sensitive. If we’ve missed the window, we shouldn’t run the task.

The above snippet shows that when an expired task comes in it gets added to the revoked list! As a result, when a new Celery node came online our existing workers were eager to share their 250MB lists with the new node. Keep in mind that this list is just a list of UUIDs: we have a lot of tasks executing! We quickly turned off this feature after we observed this behaviour. We also noticed that a lot of workers were restarting at midnight - 250MB multiplied by 30 workers restarting is a lot of handshakes and a lot of data!
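As a rough back-of-the-envelope check, the figures above can be multiplied out. This sketch assumes that every restarting worker receives one full revoked list from each of the other workers; that fan-out is an assumption for illustration, not something we measured, and the function name is hypothetical.

```python
# Figures quoted in the post.
REVOKED_LIST_MB = 250
WORKERS = 30


def mingle_ingress_gb(list_mb, workers):
    """Estimate total data exchanged if every worker restarts once.

    Assumption: each restarting worker receives one revoked list
    from every other worker in the cluster.
    """
    replies_per_restart = workers - 1
    total_mb = list_mb * replies_per_restart * workers
    return total_mb / 1000  # decimal gigabytes


estimate = mingle_ingress_gb(REVOKED_LIST_MB, WORKERS)
print(f"{estimate:.1f} GB")
```

Under that assumption the estimate lands in the low hundreds of gigabytes, which is in line with the ingress we saw at midnight.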

Looking through the supervisor logs to find the cause of the restarts initially gave us a red herring: processes were exiting with exit code 0. Surprisingly, Celery will also exit 0 on an unrecoverable exception, so we started looking through Sentry for anything suspicious. We uncovered this exception:

TypeError: __init__() missing 1 required positional argument: 'url'

The rest of the trace was unhelpful due to the coroutine nature of its source. At a glance the exception appears to be a bug on our end, but looking at the source reveals it to be a pickle deserialization error.

Ultimately, we found the issue was not an unpicklable class, but an unpicklable exception being passed to the retry mechanism. We filed an issue and removed all custom exceptions from the retry method.
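To illustrate how an exception can pickle fine but fail to unpickle, here is a minimal sketch using only the standard library. The class names and URL are hypothetical and this is not the exception from our codebase; it only shows one common way to end up with the same kind of TypeError. By default an exception is rebuilt from its args tuple, so any required constructor argument that was not passed to the base class is missing on the way back in.

```python
import pickle


class BrokenFetchError(Exception):
    """Hypothetical exception that cannot survive a pickle round trip."""

    def __init__(self, message, url):
        # Only `message` ends up in self.args, so unpickling calls
        # BrokenFetchError(message) and `url` is missing.
        super().__init__(message)
        self.url = url


class PicklableFetchError(Exception):
    """Same exception, but it tells pickle how to rebuild itself."""

    def __init__(self, message, url):
        super().__init__(message)
        self.url = url

    def __reduce__(self):
        # Supply every constructor argument explicitly.
        return (type(self), (self.args[0], self.url))


def round_trip(exc):
    """Serialise and deserialise an exception, as a result backend might."""
    return pickle.loads(pickle.dumps(exc))


try:
    round_trip(BrokenFetchError("boom", "https://example.com"))
except TypeError as error:
    print(f"unpickling failed: {error}")

restored = round_trip(PicklableFetchError("boom", "https://example.com"))
print(restored.url)
```

Defining `__reduce__` is one option; we took the simpler route of not passing custom exceptions to retry at all.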

If you’re running Celery and rely heavily on fine-tuned task expiration, we recommend turning off mingling. We’d also recommend not passing custom exceptions into Celery’s retry mechanism and instead logging exceptions where they’re initially raised.
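For reference, mingling can be skipped with the worker's `--without-mingle` option, for example `celery -A example worker --without-mingle` on the command line. The sketch below does the same from Python; the app name, broker URL and helper function names are hypothetical, and `worker_main` with an explicit `worker` sub-command assumes a reasonably recent Celery 5 release.

```python
def worker_argv(loglevel="INFO"):
    """Arguments for a worker that skips the mingle step at startup."""
    return ["worker", "--without-mingle", f"--loglevel={loglevel}"]


def start_worker():
    """Start a worker in-process (not called here; needs Celery and a broker)."""
    # Imported lazily so the sketch can be read without Celery installed.
    from celery import Celery

    # Hypothetical app name and broker URL.
    app = Celery("example", broker="amqp://guest@localhost//")
    app.worker_main(argv=worker_argv())
```

With mingling off, a revoked task is only known to the workers that received the revoke broadcast, so it is worth checking that nothing in your setup depends on revocations reaching workers that start later.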

Why I joined the Kogan.com engineering team

Welcome to our monthly “Why I joined the Kogan.com engineering team” series! Every month, we talk with members of the Kogan.com engineering team to learn who they are and why they chose to pursue careers at Kogan.com.

Anita is the Talent Acquisition Lead at Kogan.com. She is a huge fan of coffee (strong long black please!) and is always on the lookout for the next greatest breakfast spot in Melbourne. In her time at Kogan.com, Anita has helped grow the team and introduce various internal & external community initiatives. Learn all about Anita, her team, and how she’s making an impact on #TeamKogan.

A glimpse into the life of a Software Engineer at Kogan.com

A career in software engineering can open a lot of exciting doors, allowing you to support key business initiatives, create new software features and functionalities, and help to keep everything running effectively. For many software engineers, there's so much variety to their days, with the work they're carrying out changing from day-to-day (or even hour-to-hour).

We took a glimpse at what life as an engineer looks like at Kogan.com. Anita, our Talent Acquisition Lead, sat down with Software Engineer Michael to explore this. He shares his biggest learnings on the job so far, his career journey, and what life is like for him at Kogan.com.