Skip to main content

Hello All!

I wanted to start a thread in the community to discuss implementations and solutions to the Kaggle X-Mas challenge. For those of you who don’t know, Kaggle it is a site intended for promoting data science through courses and challenges. Every X-Mas they have a special optimization challenge. You can find more information below on this years challenge in the link below:

https://www.kaggle.com/c/santa-2021/overview

A quick summary is: “Santa is creating a new streaming service and will play a 24/7 loop of 7 films. He would like the showing sequence to be optimal. Hence 3 groups of elves are watching all possible sequences (7! = 5040) of 7 movies to evaluate what is the best sequence.

So, it has to do with permutations and TSP’s. Permutations, because we need to identify all possible combinations of sequences. And TSP’s because the problem can be transformed to a multi Traval Salesman Problem. Every salesman in this case is a group of Elves, and every city is a sequence of (7) films that must seen by at least 1 group (5040 cities in total). In order to optimize the sequence, we can evaluate the distance between permutations as below:

For example, the two sequences “1234567” and “4567123” could be connected as 123{4567]123}. The films between h ]’s are the first sequence, and the films between { }’s are the second sequence. Hence if they watch the films in this order (1234567123) they are able to watch 2 (7 long) sequences in one go, viewing a total of 10 movies as opposed to 14 (7 x 2), improving the total necessary time to watch all sequences.

This is a difficult problem. TSP is already a challenging problem. Having multiple salesmen, a high quantity of cities and asymmetrical distances only makes it harder.

There are two additional business rules that need to be taken into account:

  1. All elf groups must watch a subset of the permutations (that start with 2 specific movies). So there is overlap in visiting these specific cities, but not all of them (120 of the total 5040).
  2. There are 2 wildcards that can be placed amid a sequence. That wildcard serves as any of the movies; hence you can cover a lot by using this.

I will share a project and implementation I have done for this below and would love to discuss on how to solve this and improve!

Thanks

 

As mentioned above, here is my initial implementation to solve this challenge in AIMMS:

https://github.com/luispintoaimms/aimms-santa-2021

I implemented some heuristics in order to make running times feasible. They are:

  • Selecting the subset of permutations that will be covered by each elf group with a greedy algorithm
  • Cutting distances that are two costly (can be configured)

Due to this, the solution will not be globally optimal.

To run, you can simply use the MainExecution.

I would love to hear feedback and discuss improvements.


Hey Luis,

I tried a constructive approach for this problem, but unfortunately, didn't get competitive results.

My best solution was 2533.

But here are a few insights that might help reduce your TSP implementation:

For each permutation, there are only one other that has only 1 char of diference:
Consider the permutation ABCDEFG, you can cycle one character to cover 7 permutations
ABCDEFG, BCDEFGA, CDEFGAB, DEFGABC, EFGABCD, FGABCDE, GABCDEF. The next one goes back to the start.
Which gives the following sequence ABCDEFGABCDEF.
You then can reduce the number of permutations from 5040 (7!) to 720 (6!). 

If I get some spare time, I will try your model :slight_smile:


Just pushed an update to the gitlab repository with improvements to the distance calculations made by @Marcel Hunting. It is now possible to generate all necessary data in less than a minute.

Thank you!

 


Reply


Didn't find what you were looking for? Try searching on our documentation pages:

AIMMS Developer & PRO | AIMMS How-To | AIMMS SC Navigator