Ci flwadd #514

Open
wants to merge 82 commits into
base: ci_bigblade

tommydcjung

No description provided.

tommydcjung and others added 30 commits January 11, 2021 15:21
* Move wormhole concentrator outside pod ruche array

* merge north and south vcaches traffic
extend cid to 2-bit
* clean up branch trace

* remote interrupt interface

* npc_r

* add mret encoding

* fcsr load goes to rs2_val

* interrupt csr test

* add mcsr

* interrupt trace test

* set pc_init_val by _start;

* interrupt remote test

* remote interrrupt icache miss

* trace interrupt icache miss test

* CR

* change remote interrupt eva

* interrupt trace countdown

* printing interrupt taken

* interrupt trace float

* mret display; debug_p

* more test; remote interrupt ptr

* interrupt trace jump loop

* interrupt trace jump loop icache

* interrupt dual source

* interrupt trace branch loop icache

* interrupt_trace_branch_mispredict_loop

* remote load loop with interrupt
* io rtr tag client

* set width_p for io rtr tag clients
* wh return fiof

* fix
* Added reset_done port to dpi interface

* Adds missing inputs to endpoint_to_fifos

* Removed pod x/y

* Adds CUDA kernel for DMA test

* Adds missing file

Co-authored-by: Dustin Richmond
* temp commit

* fix regression

* offset dmem start addr by 8 bytes for interrupt handler

* CR

* cr

* adding .dmem.interrupt section; declaring _interrupt_arr in crt.S; update hello to use interrupt arr using extern

* add comment
* Adding trace interrupt test for idiv, fdiv and imul

* Adding remote interrupt tests with multiplier and divider

* Testing multiple remote interrupts, icache misses in handler and different handler return strategies

* Adding a interrupt test regression

* Deleting duplicates within the interrupt regression suite without contaminating author history

* Using a macro for the start code

* Makefile cleanup

* More regression makefile cleanup

* Modifying macro name

* Adding npc mret test

* Deleting duplicate files and noting original author in specific files

* Adding interrupt tests to the no recurse list

* Adding a mini threading test

* Directed test exposing the need for PR #463

* Adding a readme

* Minor makefile modifications

* Adding missing test to regression
tommydcjung and others added 16 commits March 30, 2021 08:12
* Rewrite Victim Cache Profiler Parser (#428)
* Fixed vcache profiler bugs with re-write
* fix tile parser to get correct min cycle no. for absolute total cycles calculation when using multiple tags (#480)
* Fix header_print_p (for bigblade)

Bugs:
* Order of operations: Python binds or higher than >
* Counting of repeated tags. Previously, only the last tag was counted
* Iteration-order aware tag window

New Features:
* Atomic Misses
* Response Stalls
* More documentation

* Remove deprecated code
* Small field name modifications
* Fixed issue with mismatched tags error
* Fixed issue where TG origin/dim were confused for Device origin/dim


Co-authored-by: Emily Furst
* imul mux order

* fp_exe flush minimize toggles

* add comment

* add comment 2
* io rtr

* pod row refactor

* move bsg_tag out of pod

* remove unused port

* clean up

* move out wh ruche buffers

* move out west ruche buffers
* num_clk_ports_p for subarray

* fix x -> c

* fix

* add param in pod row

// Machine Format:
// rs1 rs2 rd opcode
// 0000000_?????_?????_111_?????_0000100
`define RV32_FLWADD_OP 7'b0000100

// Pulpino supports a post-increment instruction in their compiler
// here: https://github.com/taylor-bsg/pulp-riscv-gcc/blame/master/gcc/config/riscv/riscv.md#L3663
// this can be used as a reference for adding compiler support
// for instructions with sideeffects to addresses.
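
The machine format quoted above fixes funct7, funct3, and the opcode, and leaves three 5-bit register fields free. As a minimal sketch only, the whole pattern could be matched with a single `casez` item. The macro name `RV32_FLWADD` and the module `is_flwadd_sketch` are hypothetical and do not appear in this PR.

```systemverilog
// Hypothetical full-instruction pattern built from the machine format above.
// Field order (MSB to LSB): funct7, rs2, rs1, funct3, rd, opcode.
`define RV32_FLWADD 32'b0000000_?????_?????_111_?????_0000100

// Hypothetical stand-alone matcher; the real decoder in this PR switches on
// instruction_i.op and checks funct7/funct3 separately.
module is_flwadd_sketch
  (input  logic [31:0] instruction_i
  ,output logic        is_flwadd_o
  );

  always_comb begin
    unique casez (instruction_i)
      // '?' bits are don't-cares in casez, so any register fields match.
      `RV32_FLWADD: is_flwadd_o = 1'b1;
      default:      is_flwadd_o = 1'b0;
    endcase
  end

endmodule
```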


// Load & Store
logic is_load_op; // Op loads data from memory
logic is_store_op; // Op stores data to memory
logic is_load_op; // Op is lw or flw

Should this comment now mention flwadd as well?

| (instruction_i.funct7 == `RV32_FCVT_S_F2I_FUN7)); // FCVT.W.S, FCVT.WU.S
end
`RV32_FLWADD_OP: begin
decode_o.write_rd = (instruction_i.funct7 == 7'b0000000) & (instruction_i.funct3 == 3'b111) & (instruction_i.rs1 != '0);

Presumably part of this is redundant with the fact that we are already in the RV32_FLWADD_OP case statement?


Probably can just add FLWADD_OP to standard set of cases at the top of the casez statement?

decode_o.write_rd = (instruction_i.funct7 == 7'b0000000) & (instruction_i.funct3 == 3'b111) & (instruction_i.rs1 != '0);
end
`RV32_SYSTEM: begin
decode_o.write_rd = (instruction_i.rd != '0); // CSRRW, CSRRS

combine with case above?

always_comb begin
unique casez (instruction_i.op)
`RV32_BRANCH, `RV32_STORE, `RV32_OP: begin
decode_o.read_rs2 = 1'b1;
end
`RV32_FLWADD_OP: begin

combine with above?
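
One way to read this suggestion, as a hedged sketch: fold `RV32_FLWADD_OP` into the existing case item. This assumes the separate FLWADD arm (not shown in the excerpt) also sets `read_rs2` to 1. The module `read_rs2_sketch` and its port names are hypothetical; the branch, store, and op values are the standard RV32 major opcodes, and `RV32_FLWADD_OP` is taken from the define quoted earlier in this PR.

```systemverilog
// Standard RV32 major opcodes, plus the custom opcode from this PR.
`define RV32_BRANCH    7'b1100011
`define RV32_STORE     7'b0100011
`define RV32_OP        7'b0110011
`define RV32_FLWADD_OP 7'b0000100

// Hypothetical trimmed-down decoder fragment, not the real decode module.
module read_rs2_sketch
  (input  logic [6:0] op_i
  ,output logic       read_rs2_o
  );

  always_comb begin
    unique casez (op_i)
      // Assumption: flwadd reads rs2 exactly like the other opcodes listed,
      // so it can share their case item instead of having its own arm.
      `RV32_BRANCH, `RV32_STORE, `RV32_OP, `RV32_FLWADD_OP: begin
        read_rs2_o = 1'b1;
      end
      default: begin
        read_rs2_o = 1'b0;
      end
    endcase
  end

endmodule
```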

decode_o.read_frs2 = 1'b0;
decode_o.read_frs3 = 1'b0;
decode_o.write_frd = 1'b1;
decode_o.is_fp_op = 1'b0;

Add a comment explaining why FLWADD is not an fp op.

@@ -97,7 +97,7 @@ module lsu

  assign dmem_v_o = is_local_dmem_addr &
  (exe_decode_i.is_load_op | exe_decode_i.is_store_op |
- exe_decode_i.is_lr_op | exe_decode_i.is_lr_aq_op);
+ exe_decode_i.is_lr_op | exe_decode_i.is_lr_aq_op | exe_decode_i.is_flwadd_op);

is_load_op now includes is_flwadd_op, right?

@@ -152,7 +152,7 @@ module lsu


  assign remote_req_v_o = icache_miss_i |
- ((exe_decode_i.is_load_op | exe_decode_i.is_store_op | exe_decode_i.is_amo_op) & ~is_local_dmem_addr);
+ ((exe_decode_i.is_load_op | exe_decode_i.is_store_op | exe_decode_i.is_amo_op | exe_decode_i.is_flwadd_op) & ~is_local_dmem_addr);

same comment as above?

  wire int_remote_load_in_exe = remote_req_in_exe & exe_r.decode.is_load_op & exe_r.decode.write_rd;
- wire float_remote_load_in_exe = remote_req_in_exe & exe_r.decode.is_load_op & exe_r.decode.write_frd;
+ wire float_remote_load_in_exe = remote_req_in_exe & (exe_r.decode.is_load_op | exe_r.decode.is_flwadd_op) & exe_r.decode.write_frd;

is_load_op now includes is_flwadd_op?


wire int_remote_load_in_exe = remote_req_in_exe & exe_r.decode.is_load_op & ~exe_r.decode.is_flwadd_op & exe_r.decode.write_rd;
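
A hedged sketch of how the suggested line might sit next to the float case, assuming `is_load_op` is asserted for flwadd and that flwadd sets both `write_rd` (for the address register update) and `write_frd` (for the loaded value), as the decode snippets in this PR suggest. The module `remote_load_kind_sketch` and its flattened port names are hypothetical.

```systemverilog
// Hypothetical stand-alone fragment; the real signals live in the core's
// exe_r.decode struct rather than on module ports.
module remote_load_kind_sketch
  (input  logic remote_req_in_exe_i
  ,input  logic is_load_op_i     // assumed to cover lw, flw, and flwadd
  ,input  logic is_flwadd_op_i
  ,input  logic write_rd_i
  ,input  logic write_frd_i
  ,output logic int_remote_load_in_exe_o
  ,output logic float_remote_load_in_exe_o
  );

  // flwadd is excluded here so that its integer write-back is not
  // mistaken for a remote integer load.
  assign int_remote_load_in_exe_o =
    remote_req_in_exe_i & is_load_op_i & ~is_flwadd_op_i & write_rd_i;

  // With flwadd folded into is_load_op, the float case needs no extra term.
  assign float_remote_load_in_exe_o =
    remote_req_in_exe_i & is_load_op_i & write_frd_i;

endmodule
```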

((id_r.decode.read_frs1 & (id_rs1 == exe_r.instruction.rd) & exe_r.decode.write_frd)
|(id_r.decode.read_frs2 & (id_rs2 == exe_r.instruction.rd) & exe_r.decode.write_frd)
|(id_r.decode.read_frs3 & (id_rs3 == exe_r.instruction.rd) & exe_r.decode.write_frd));



seems like the above is redundant if we include flwadd in local_load_in_exe


@taylor-bsg taylor-bsg left a comment


High-level feedback: the code is simpler if is_load_op includes flwadd (which is currently the case, but the code does not reflect this).
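
As a minimal sketch of that simplification, assuming the decoder asserts `is_load_op` for flwadd: the two LSU valid expressions can stay in their pre-PR form, with no separate `is_flwadd_op` term. The module `lsu_valid_sketch` and its flattened port names are hypothetical; the real `lsu` module takes these signals through `exe_decode_i`.

```systemverilog
// Hypothetical trimmed-down view of the two LSU valid signals discussed above.
module lsu_valid_sketch
  (input  logic is_load_op_i          // assumed asserted for lw, flw, and flwadd
  ,input  logic is_store_op_i
  ,input  logic is_lr_op_i
  ,input  logic is_lr_aq_op_i
  ,input  logic is_amo_op_i
  ,input  logic is_local_dmem_addr_i
  ,input  logic icache_miss_i
  ,output logic dmem_v_o
  ,output logic remote_req_v_o
  );

  // Local DMEM access: flwadd is covered by is_load_op_i.
  assign dmem_v_o = is_local_dmem_addr_i &
    (is_load_op_i | is_store_op_i | is_lr_op_i | is_lr_aq_op_i);

  // Remote request: again no dedicated flwadd term is needed.
  assign remote_req_v_o = icache_miss_i |
    ((is_load_op_i | is_store_op_i | is_amo_op_i) & ~is_local_dmem_addr_i);

endmodule
```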

7 participants