[issue-858] GPU implementation of 911 vertices #880
Draft
NicolasJPosey wants to merge 47 commits into PoseyDevelopment from issue-858-911vertices-gpu-implementation
Conversation
Adds support for functions that take two uint64_t arguments so that loadEpochInputs can be registered and called from the OperationManager class.
…-911vertices-gpu-implementation
Added a data member for storing the total number of events that are read into the InputManager. This allows us to define vector capacities based on the number of events being simulated.
AllVertices now has a non-virtual loadEpochInputs method. This calls two virtual methods, one for loading the epoch inputs and the other for copying the inputs to the GPU. The default behavior for both is to do nothing.
…od instead of a connections method
This method makes more sense as a behavior of vertices, since the behavior also needs to run on the GPU.
…-911vertices-gpu-implementation
We need a dynamically sized array, so we use a vector instead of an array. But we want the implementation to be easily mirrored on the GPU, so we interact with the vector as we would with an array.
…ent call
The push_back call was not easy to mirror on the GPU. We already have examples of using the EventBuffer and its insertEvent call on the GPU, so we changed to that implementation. This also makes the required buffer size explicit, again helping the mirrored GPU implementation.
This allows us to remove the resize calls which we don't want to do on the GPU. Also added a DoubleEventBuffer to use in place of RecordableVector<double>.
…-911vertices-gpu-implementation
The correct pattern is to first copy the device pointers to the CPU and then the values into the CPU data members. It happens that a uint64_t and a uint64_t pointer are the same size, so the bug went unnoticed. However, if this pattern is repeated for a type like float, an illegal memory error is thrown.
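A sketch of the two-step copy-back pattern, with hypothetical struct and member names (the real classes and fields differ):

```cuda
#include <cuda_runtime.h>

// Assumption: a device-resident struct holds device pointers to the data.
struct VertexDeviceProps {
   float *summationMap_;   // device pointer stored inside a device struct
};

void copyFromDevice(VertexDeviceProps *devProps, float *hostData, int numVertices)
{
   VertexDeviceProps props;
   // Step 1: copy the struct (which holds device pointers) to the CPU.
   cudaMemcpy(&props, devProps, sizeof(VertexDeviceProps),
              cudaMemcpyDeviceToHost);
   // Step 2: props.summationMap_ is now a usable device pointer; copy
   // the actual element values, sized by element count, not pointer size.
   cudaMemcpy(hostData, props.summationMap_, numVertices * sizeof(float),
              cudaMemcpyDeviceToHost);
   // Skipping step 1 only "worked" for uint64_t because
   // sizeof(uint64_t) == sizeof(void*) on 64-bit platforms; for float
   // the sizes differ and the bad copy faults.
}
```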
The GPU noise array only works when numVertices is at least 100 and a multiple of 100. Otherwise, an invalid kernel configuration error is thrown, which masks other possible errors.
Having asserts in kernels can cause them to fail silently. Using print statements and returning is a better way to fail inside kernels.
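A sketch of the print-and-return style, with a hypothetical kernel name and check:

```cuda
#include <cstdio>

// Instead of assert(), which can abort the launch with little context,
// print the failure and bail out of the offending thread so the message
// is visible on the host console.
__global__ void advanceVerticesKernel(const double *inputs, int numVertices)
{
   int idx = blockIdx.x * blockDim.x + threadIdx.x;
   if (idx >= numVertices)
      return;

   if (inputs == nullptr) {
      printf("advanceVerticesKernel: null inputs (thread %d)\n", idx);
      return;   // fail loudly for this thread; the rest of the launch continues
   }
   // ... per-vertex work ...
}
```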
Clean up of commented out code, unnecessary extra variables, and unused methods.
The update to vertex creation that gives each vertex the same-sized data member for the GPU meant that we would never get a dropped call due to large queue sizes. The logic was changed so that we interact with vertex queues in PSAPs and RESPs as if the size were equal to the number of trunks, which was the size in the original implementation.
RecordableVectors are cleared after each epoch if they are the dynamic type; the size is not reset. We need subscript-operator access for droppedCalls, so the type must be constant, which does not clear the vector after each epoch.
The current implementation for generating noise on a device makes assumptions that break for the 911 GPU model. To get noise support for 911, we implemented a way for vertices to specify how many noise elements they will need. A method was then added to GPUModel that rounds the input up to the nearest multiple of 100.
Because only caller regions simulate attempted redials, we add a vector that maps the caller region vertex IDs to the noise array on the device. This lets us use the existing noise algorithm with larger graphs, since noise can only be generated for up to 10000 vertices.
If the number of trunks and servers is equal and the queue is full, capacity minus busy servers is negative. Since dstQueueSize is of type uint64_t, it can't be negative, and the unsigned subtraction wraps around. The comparison then gives a false positive that the queue is not full. The fix is to cast the size to an int so that the right comparison is done.
The call metrics account for the vast majority of the physical memory used by the GPU. By resizing each to a smaller value, we can fit larger graphs on the GPU by using more epochs with smaller steps per epoch.
Firing rate should actually be equal to 1 since we can have at most 1 call per second.
The buffer size used for a CircularBuffer is 1 more than the capacity passed into the constructor. When we construct the buffer, we pass in the number of trunks but were effectively using 1 less during the simulation.
Metrics that used totalNumberOfEvents and totalTimeSteps were using more memory than needed. These were changed to maxEventsPerEpoch and stepsPerEpoch respectively. Also changed copyTo and copyFrom in All911Edges to use heap memory to prevent stack overflows with large graphs.
The buffer inside the CircularBuffer implementation is 1 larger than the capacity set at construction. VertexQueues are CircularBuffers so we add 1 where we use the buffer size.
Fixed allocation, copyTo, and copyFrom for VertexQueues. They are CircularBuffers which internally have a buffer that is 1 more than the capacity. The sizes used were updated to be 1 more than the stepsPerEpoch to match the construction capacity.
Memory is mostly dependent on epoch duration so we decrease that parameter and increase the number of epochs parameter by the same factor. This keeps the total time steps constant but reduces memory usage. We can only have 1 call per step so the max firing rate should be 1.
Closes #
Description
Checklist (Mandatory for new features)
Testing (Mandatory for all changes)
test-medium-connected.xml: Passed
test-large-long.xml: Passed