# Thread: Scheduling - Collision Detection

1. ## Scheduling - Collision Detection

Hello,

This is an interesting question...one that I have been battling with for a few days now.

I am writing a program to schedule jobs on computers. (All hypothetical). The jobs are listed with start and end times, and the computers that they may be run on.

The goal is to schedule as many of these jobs as possible. I have solved it by writing a recursive function that calls itself by scheduling every job on each of its potential computers and not schdeduling it, then recursing to all remaining jobs. In an effort to cut runtime, I am pruning my recursive call tree by not attempting to schedule a job which conflicts with an already scheduled job.

This is where my question comes into play: I know there is a way to check for job conflicts in a constant amount of time, but I can't figure out how. Currently, I have a vector for each computer, of which the elements are items representing the jobs that have been scheduled on that computer. These jobs have their start/end times listed. So, to check for a collision, I am iterating through the computer vector, and checking each element's start/end times to see if there exists a conflict with the job that I am attempting to schedule.

Does anyone have a suggestion on how I may use a better data structure, or some other method to detect if a collision exists?

Thanks!

2. I'm not sure how to speed up the detection process unless you sort the items so you don't have to check every single one. But if you are interested there is an easier way to solve the entire scheduling question.

This problem is an easy one to solve with a greedy algorithm. Basically you sort all of the jobs by ending time, pick the first item, then iterate through your sorted list finding the first job that starts after that ending time. Continue to do this until you can't place any more jobs and you got your answer!

Greedy algorithms are nice when they work

3. Ackk... my bad. I missed that you said you are scheduling for multiple computers, which changes things a bit. No greedy algorithm is gonna solve that, sorry!

I have seen the algorithm that solves this before though, let me see if I can find it.

And while I am thinking about it, you could also have a bool vector/array that contains an element for each space of time. When a space is filled, set it to true. That way you would only have to iterate through the times to see if they are all set to false.

4. I was thinking about sorting them at insertion, like with a priority queue, or something, but, I'm not sure what to sort on, etc. Any suggestions?

Regarding your other proposed solution: That's next! I need to solve this problem recursivley, and iteratively....so, I'll tackle that solution once my recursion is as speedy as possible.

5. ok consider this...

If there are n computers create n ques.. Now when you get a new job to be schedulde.. find out from all the que the time takesn to start this job..(you cen get this based on the already pending jobs in each que).. now push this job into the que which returned the minimum time....

You can even simulate this by putting a delay corresponding to the amount of time the job takes.. and then pushing the job from the que...

The que needs to be circular..

6. Oops, check out my edit above which was posted after you wrote. Yeah, I'm not sure what to sort on if you wanted to do it this way, but now that I think about it it really wouldn't matter. Just pick to sort by beginning time or ending time and you would likely get the same results either way. A problem with a priority queue is that it really isn't sorted all that well beyond the first couple items. You're probably better off just using a vector and a function that inserts into it using a divide and conquer approach. But you are almost certainly better off using a vector that has an element for each unit of time so that you only have to check the indexes that correspond directly with the times you are checking.

7. Ok, so while at lunch, I thought of this:

As I read in the jobs from file, order them based on end time. (put them in a priority queue, heck, even do insertion sort, if I watned...) This will be slow (worst case O(n^2), but I only do it once...).

I'll still use a vector for each computer...

Recurse from the begining of the job list. If I schedule a job, push it onto the corresponding computer vector. Each time I schedule a job, look only at the last element in the computer vector. I know this last element is guarenteed to end later than all the other jobs in the vector. If the job I am attempting to schedule starts AFTER the end time of the last job in the computer vector, then I may schedule the potential job. If it doesn't start after the last job's endtime, then I may not schedule that job.

What do you guys think? Can you prove this will/will not work? Seems good to me!

8. Yeah, I was just about to suggest something like that. I was beginning to think that a greedy algorithm would indeed work on multiple computers. Try these two different approaches which may be what you were thinking and I'm willing to bet that at least one of them (if not both) will produce the optimal solution.

1. Do the greedy algorithm I said above one computer at a time, always placing the first available job that starts after the last placed job for the computer from the sorted by end time list of jobs. When this computer is full (you've iterated through the entire list of jobs), move to the next. Do this until no more computers.

2. Same thing more or less, except you iterate through the list of sorted jobs and place the job in the first available computer that can hold it. If no computers can take it, skip it and move onto the next job. Do this until you have iterated through the entire list.

Neither of these need recursion though, so if the problem states that you need recursion just save this (if it works lol) for the iterative solution.

9. Yeah,

The method that I'm currently using is your #2, although #1 has appeal as well! My only concern is that my data structures/ordering method will produce the results that I want...and I'm still debating that before I go implement it.

10. Well, for #2 a priority queue would work just fine if you set it up to always return and pop the job with the earliest end time, just overload the > and < operators if you are using classes. You couldn't really use a priority queue for #1 though since you want to be able to iterate and pull things from the middle, something not meant for a p-queue. That one I'd suggest just putting them all into a vector then sorting using quicksort or whatever you are comfortable with.

11. Actually, the priority queue would be less efficient then the quicksort either way. You would only need the p-q if you plan on adding things throughout the program, which you aren't. Since everything is done at the very beginning, putting everything into a vector and quicksorting would be best imo.