I'm not sure what that is.....any elaboration?
Thanks!
Printable View
I'm not sure what that is.....any elaboration?
Thanks!
Is the number of computers and/or tasks a fixed amount that doesn't change from run to run? If not then I'm stumped because all I can think of to do is test every possibility, but you wouldn't know how many nested loops to use. Have you learned dynamic programming yet? Its possible theres a dynamic programming solution...
Ok, now I'm interested in this. I would like to help but I need a better definition of the "problem statement". I'm not getting a clear enough picture from the postings so far.
Pretend like like its an assignment and you're the prof - I'd love to put my hand at a solution :)
(Sometimes, just coming up with a consise problem statement reveals a solution.)
gg
Yes, the number of computer and tasks may both change from run to run.
In general, the problem statement is that we want to schedule as many jobs as possible. Each job may be scheduled on a number of computers (the same number for each job, however not necesarily the same computer). In other words, job 0 may be scheduled on computer a and b, while job 1 may be scheduled on computer c and b. Upon input we know the number of computers total, and the number of computers associated with each job.
Because a job may be scheduled on more than one computer, it has options if a conflict arises. If it can be on computers a and b, but the times are used on a, perhaps it can fit in those times on b.
Hopefully this clears some things up.
Ask if you have additional questions.
Ok, I think I see whats going on here. A few questions:
(1) does the load need to be balanced or just complete as many tasks as possible?
(2) what if the combination of computers and tasks end up having tasks that can't run? For example, 3 tasks that overlap in their times (somewhere), and only 2 computers to schedule them on.
(3) if there aren't enough resources (computers) for the given number of tasks, do the tasks have any kind of priority (if the awnser to 1 is "complete as many as possible" then the awnser is "yes".....I'll explain...)
gg
1. The main goal is to schedule as many tasks as possible. There does not need to be a balance.
2. This is certainly a possibility. Many test scenarios have jobs that simply can't be scheduled because of time overlaps. If the maximum number of jobs is fewer than the total number of jobs, so be it.
3. There is no priority associated with the jobs.
Forget about what I said about priority - its not needed.
A greedy type algorithm is good way to go. I think what you need is a different representation of the problem. I'm also going to give you somewhat of a proof to show that the algorithm will always schedule as many tasks as possible. Let me lay some ground work for the proof'age. (And please, no math-types ripping on my terminology, including "proof" :) )
Definition: Two tasks intersect if any part of their scheduled times overlap.
Theorem: The maximum number of intersecting tasks is also the number of computers needed in order to schedule all tasks.
Proof (by induction, sorta): For a single task A, one computer is needed to schedule it. If a second task B intersects A, then B can not be scheduled on A's computer so a second computer is needed to schedule B. If a third task C, intersects both A and B, then C can not be scheduled on either A's or B's computer and therefore needs a third computer to be scheduled. (3 intersecting tasks, 3 computers are needed, and so on and so an)
With this in mind, we don't have much to maximize since the only time a task will not be scheduled is if it intersects with more tasks than there are computers.
On with the problem representation: Have an array of vectors, one for each computer. The first index in this array contains the ordered tasks that are scheduled on the first computer and so in. The algorithm is then: for each computer in the array, attempt to schedule the next task from the pool of available tasks. If the task intersects with a task already scheduled on that computer, move on the next computer in the array. If the task to be scheduled has an intersection on all computers, then you know that task can not be scheduled without adding another computer (see theorem/proof).
Here's some pseudo-code to drive it home:
The may be similar to (if not identical) to your previous algorithms, but hopefully this representation and the logic of the "proof" will give you some assurance that all tasks that can be scheduled will be. Hopefully I haven't overlooked some basic requirement that blows all this out of the water.Code:Computers comps[NUM_COMPUTERS]
TaskPool tasks
WHILE taskPool not empty
Task t = tasks.getNextTask
//look for first available computer to schedule t
FOR each c in comp DO
IF t intersects one of c's scheduled tasks
continue
ELSE
schedule t on c
break
ENDIF
ENDFOR
ENDWHILE
gg
[edit]
I have 69 posts...he he
Yeah, that was one of the greedy algorithms I proposed. Unfortunately what screws up any greedy algos I can think of is the problem that each task has a list of computers that it may work on which isn't necessarily ALL the possible computers. If you sort by end times which is required for the greedy algorithm, you get wrong results with something like:
Task 1: Start 1:00 End 2:00 possible comps: A B
Task 2: Start 1:00 End 3:00 possible comps: A
And if you try and sort by total possible computers, then you get screwed up with something like:
Task 1: 1:00 End 3:00 possible comps: A
Task 2: 1:00 End 2:00 possible comps: A B
Task 3: 1:00 End 2:00 possible comps: A B
Task 4: 2:00 End 3:00 possible comps: A B
Task 5: 2:00 End 3:00 possible comps: A B
So I don't think any greedy algorithm will work unfortunately... and unless you know at compile time how many comps there are, I'm stumped as to how to do this iteratively.
My algorithm and representation works and solves the problem as I understand it - doesn't matter if the number of computers are known at compile time or run time - it works if the state is maintained from run to run and you add/remove tasks/computers.
If you think my algorithm is flawed somehow then please prove it using my algorithm and a test set of data where not all tasks get scheduled that could have been (again, using my algorithm).
You can't say all greedy algorithms will fail because a particular greedy algorithm fails.
gg
I agree that your algo works when a task can work on all computers but as far as I can tell not when a task has only a small possible number of computers it is allowed on. I'm curious in what order your algo picks the tasks. But one example is the first one I gave above:
Task 1: Start 1:00 End 2:00 Allowable computers: A B
Task 2: Start 1:00 End 3:00 Allowable comptuers: A only
Your algo picks task one, puts it on the first available computer which is comp A. It then picks task 2, and now there are no computers it can work on since A is full and it is only allowed on A, so skips it. But the ideal solution is Task 1 on comp B and Task 2 on comp A.
I'm not necessarily saying that all greedy algo's fail, I'm just saying that I can't think of one. A main factor in a greedy algo is the the best choice at a time is always the correct choice. Now best is always subjective, but I can't think of a single way to always declare without a doubt a best choice. But I'd be happy to eat my words if you can think of one! :D
BTW where do you live in Georgia? I used to live in Woodstock just outside of Atlanta (and the rest of my family still does).
Ok, there's the "overlooked basic requirement" - that there are some tasks that can only be run a specified subset of computers.
Well, I have to go take a shower (TMI!), but in the meantime, can someone elaberate on this requirement. For example, are some tasks "pre-assigned" to a single computer or subset of computers?
Also, does this algorithm need to preserve state from run to run - in other words, does it need to support dynamically adding/removing tasks/computers - or does it simply have a set of data and it needs to come up with the optimal solution in a single run?
gg
[EDIT]
I just moved from Alpharetta to Canton, right above Woodstock....small world.
I'm not exactly sure what you mean by pre-assigned, but basically each task has a subset of computers that it is allowed on, with no preference being made which of those comps in the subset it chooses. And I'm pretty sure that it is a one-time run and find the answer type of thing, although bigDaddyDrew can answer that better than I.
Actually my parents just last year moved to a new house and now their official city address IS Canton! They live just off Sixes Road and hwy 575. VERY small world. :)
By pre-assigned, I meant, are all the subsets of size 1 (no).
So, some tasks can run on any computer, and some tasks are restricted to running on 1 or more of the available set of computers. (Re-reading the posts, this was obvious - doh).
The hard part will be proving that a solution is optimal. Let me crank on this and see if I can come up with something.
gg
Hi guys,
Sorry for a bit of an absence...here are some answers or updated requirements to attempt to clear things up:
All tasks have a list of potential computers to be scheduled upon. The size of this list is the same for all tasks. So, for example, if there are 8 total computers, and if each job can be scheduled on, say, 3 computers, we might have something that looks like this:
0 2 a b c
1 3 c f g
0 4 d a h
4 5 e b g
etc....
The beginning of the input file actually has two integers before it has the list of jobs. The first integer represents the number of machines per job (3 in our example) and the second integer represents the total number of computers (8 in our example).
In addition, there is no changing of jobs, etc, at runtime. Everything is read it from a file, then the input is all processed, without having to worry about alteration later.
I have a question for you guys:
You say if an itersection exists, attempt to schedule the job on the next available computer. What do you mean "next available computer" ? I guess this might be hard to define since the computers associated with each job aren't in any particular order (see example above).
If you have other questions, please let me know.
Oh, and btw, if anyone wants to see my recursive solution (turn that into iterative?!) then ask, and I will post it.
Thanks
We're not doing home work are we?
gg